ILO modelled estimates and projections
Data considerations and methodological approach
Impact of the pandemic on ILO modelled estimates and projections
The ILO actively maintains a series of econometric models that are used to produce estimates of labour market indicators in the countries and years for which country-reported data are unavailable and to produce forecasts (see descriptions below). The model inputs are historical time series data. The unprecedented labour market shock created by the COVID-19 pandemic is difficult to assess by benchmarking against historical data. As such, most of the series in the ILO modelled estimates and projections dataset now end in 2019 (the latest year for which annual labour force survey data were available at the time of production of the estimates). For some indicators, a nowcasting model is used to provide 2020 estimates and a new projection model is used to forecast 2021 estimates. Given the exceptional situation, including the scarcity of relevant data, the 2020-21 estimates are subject to a substantial amount of uncertainty.
For more information on the nowcasting and projection models, refer to the annexes of ILO Monitor: COVID-19 and the world of work.
The ILO modelled estimates series provides a complete set of internationally comparable labour statistics, including both nationally-reported observations and imputed data for countries with missing data. The imputations are produced through a series of econometric models maintained by the ILO. The purpose of estimating labour market indicators for countries with missing data is to obtain a balanced panel data set so that, every year, regional and global aggregates with consistent country coverage can be computed. These allow the ILO to analyse global and regional estimates of key labour market indicators and related trends. Moreover, the resulting country-level data, combining both reported and imputed observations, constitute a unique, internationally comparable data set on labour market indicators.
Estimates for countries with very limited labour market information have a high degree of uncertainty. Hence, estimates of labour market indicators for countries with limited nationally reported data should not be considered as “observed” data, and great care needs to be applied when using these data for analysis, especially at the country level.
More information on the ILO modelled estimates is given below. For further details, see the methodological overview.
Data collection and evaluation for ILO modelled estimates
The ILO modelled estimates are generally derived for 189 countries, disaggregated by sex and age as appropriate. For selected indicators, a further disaggregation by geographical area (urban and rural) is performed. Before running the models to obtain the estimates, labour market information specialists from the ILO Department of Statistics, in cooperation with the Research Department, evaluate existing country-reported data and select only those observations deemed sufficiently comparable across countries.
The recent efforts by the ILO to produce harmonized indicators from country-reported microdata have greatly increased the comparability of the observations. Nonetheless, it is still necessary to select the data on the basis of the following criteria:
- Type of data source
In order for data to be included in the model, they must be derived from a labour force survey, another sufficiently comparable household survey, or a population census. National labour force surveys are generally similar across countries, and the data derived from these surveys are more readily comparable than data obtained from other sources. A strict preference is therefore given to labour force survey-based data in the selection process. However, many developing countries that lack the resources to carry out a labour force survey do report labour market information based on other types of household surveys or population censuses. Consequently, because of the need to balance the competing goals of data comparability and data coverage, some data based on other household surveys and on population censuses are included.
- Geographic coverage
Only nationally representative (i.e. not prohibitively geographically limited) labour market indicators are included. Observations that correspond to only urban or rural areas are not included, as large differences typically exist between rural and urban labour markets, and using only rural or urban data would not be consistent with benchmark data such as GDP.
- Age group coverage
The age groups covered by the observed data must be sufficiently comparable across countries. Countries report labour market information for a variety of age groups and the age group selected can have an influence on the observed value of a given labour market indicator.
The data used to prepare the World Employment and Social Outlook: Trends are gathered by countries which regularly submit employment statistics to the ILO. Conducting surveys is a complicated and costly task which some countries are unable to do on a systematic basis. To compensate, the ILO has developed statistical models in order to fill the gaps for countries in years for which no data have been reported. These models have been tested for statistical accuracy and allow the ILO to forecast changes in key labour market indicators as well as to produce global and regional aggregates. The end result of these models is a complete set of national labour statistics alongside the global and regional aggregates.
Not all countries submit statistically comparable data. Before running the models to obtain the estimates, ILO labour market information specialists evaluate existing country-reported data and select only those observations deemed sufficiently comparable across countries. The recent efforts by the ILO to produce harmonized indicators from country-reported microdata have greatly increased the comparability of the observations. Nonetheless, it is still necessary to select the data on the basis of the following four criteria: (a) type of data source; (b) geographical coverage; (c) age-group coverage; and (d) presence of methodological breaks or outliers.
Our models also include country-level data on population, economic growth, poverty and other economic indicators from the following sources:
- United Nations World Population Prospects
- IMF/World Bank data on macroeconomic indicators
- World Bank poverty estimates from the PovcalNet database
The estimates are produced using a series of models, which establish statistical relationships between observed labour market indicators and explanatory variables. These relationships are used to impute missing observations and to make projections for the indicators.
There are many potential statistical relationships, also called “model specifications”, that could be used to predict labour market indicators. The key to obtaining accurate and unbiased estimates is to select the best model specification in each case. The ILO modelled estimates generally rely on a procedure called cross-validation, which is used to identify those models that minimize the expected error and variance of the estimation. This procedure involves repeatedly computing a number of candidate model specifications using random subsets of the data: in each iteration, the observations that were left out are predicted and the prediction error is calculated.
This makes it possible to identify the statistical relationship that provides the best estimate of a given labour market indicator. It is worth noting that the most appropriate statistical relationship for this purpose could differ depending on the country.
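The cross-validation procedure described above can be illustrated with a minimal sketch. It is not the ILO's implementation: the data are synthetic, the candidate specifications are ordinary linear regressions, and names such as `cv_error` and `specs` are hypothetical. Each specification is fitted repeatedly on a random subset of the observations, the held-out observations are predicted, and the specification with the lowest average prediction error is retained.

```python
import numpy as np

def cv_error(X, y, n_rounds=200, holdout=0.2, seed=0):
    """Average squared prediction error over repeated random hold-out splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    k = max(1, int(round(holdout * n)))
    errors = []
    for _ in range(n_rounds):
        test = rng.choice(n, size=k, replace=False)        # observations treated as "missing"
        train = np.setdiff1d(np.arange(n), test)
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errors))

# Synthetic data (assumption): the indicator depends linearly on GDP growth.
rng = np.random.default_rng(42)
n = 200
gdp_growth = rng.normal(size=n)
unrelated = rng.normal(size=n)                             # covariate with no true effect
y = 6.0 - 0.8 * gdp_growth + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
specs = {
    "intercept_only": np.column_stack([ones]),
    "gdp_growth": np.column_stack([ones, gdp_growth]),
    "gdp_growth_plus_unrelated": np.column_stack([ones, gdp_growth, unrelated]),
}

# The same seed is used for every specification, so all are scored on identical splits.
scores = {name: cv_error(X, y) for name, X in specs.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Because the comparison is made on held-out observations, a specification is rewarded only for predictive accuracy, not for fitting the data it was estimated on.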
Countries are asked to follow the guidelines of the International Standards of Labour Statistics when reporting their data. However, some countries choose to apply different definitions of the indicators when reporting data nationally. In some of these cases, statisticians at the ILO process the microdata files to produce internationally comparable figures, which differ from the national figures.
We are constantly improving the ILO modelled estimates. This usually happens for one of three reasons:
- Countries make new data available. The ILO’s labour statistics database is kept constantly up to date as new national labour force surveys are released. In some cases, this may only happen after a significant delay, requiring the ILO to replace its estimates for that year with the statistics reported.
- Revisions are made to other databases used by our statistical models. As mentioned above, our Trends Econometric Models use databases maintained by other international organizations, such as the UN’s World Population Prospects and the IMF’s World Economic Outlook. These databases are periodically subject to their own revisions, which our models must take into account.
- Historical data need to be revised. Periodically, data from prior years need to be revised as new information emerges that can affect how the ILO interprets those data in its models.
Estimates of labour market indicators
The ILO’s econometric models produce estimates of labour market indicators to fill in missing values for the countries and years for which country-reported data are unavailable. For example, for unemployment rates, multivariate regressions are run separately for different regions of the world, in which unemployment rates, broken down by age and sex (youth male, youth female, adult male, adult female), are regressed on GDP growth rates. Weights are used in the regressions to correct for biases that may result from the fact that countries that report unemployment rates tend to differ (in statistically important respects) from countries that do not. (For instance, if simple averages of unemployment rates in reporting countries in a given region were used to estimate the unemployment rate in that region, and the non-reporting countries differed from reporting countries with respect to unemployment rates, the resulting regional estimate would be biased without such a correction mechanism. The “weighted least squares” approach adopted in the ILO’s models corrects for this potential problem.)
For the current year, a preliminary estimate is produced, using quarterly and monthly information available up to the time of production of the estimates. The ILO estimates employment by status using similar techniques to impute missing values at the country level. In addition to GDP growth rates, the explanatory variables include the value added shares of the three broad sectors in GDP, per capita GDP and the share of people living in urban areas. Additional econometric models are used to produce global and regional estimates of working poverty and employment by economic class (see Kapsos and Bourmpoula, 2013).
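The weighted least squares idea mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the ILO's model: the function name `weighted_least_squares`, the data and the weights are all hypothetical, and the source does not specify how the actual weights are constructed. The sketch shows only that giving some observations a larger weight shifts the estimate towards them, which is how under-represented types of countries can be made to count for more.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimise sum_i w_i * (y_i - x_i @ beta)**2 by rescaling rows with sqrt(w_i)."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Intercept-only case: WLS reduces to a weighted average of the observed rates.
# Two reporting countries with rates 4% and 10%; the second is assumed to resemble
# three times as many non-reporting countries, so it receives weight 3 (hypothetical).
X = np.ones((2, 1))
y = np.array([4.0, 10.0])
unweighted = weighted_least_squares(X, y, [1.0, 1.0])[0]   # simple average
weighted = weighted_least_squares(X, y, [1.0, 3.0])[0]     # weighted average

# Regression on GDP growth with noise-free synthetic data (assumption).
growth = np.array([-2.0, 0.0, 1.0, 3.0, 5.0])
rate = 8.0 - 0.5 * growth
Xg = np.column_stack([np.ones_like(growth), growth])
beta = weighted_least_squares(Xg, rate, [1.0, 2.0, 1.0, 3.0, 1.0])
print(unweighted, weighted, beta)
```

In the intercept-only case the simple average of the two reporting countries is pulled towards the country that stands in for more non-reporting countries, which mirrors the regional-average example given in the text.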
Youth labour market indicators
Labour market indicators for the sub-populations youth-female, youth-male, adult-female and adult-male have been estimated using the same regression techniques as the aggregate indicators. However, the estimates are adjusted using the shares in the population implied by the labour force survey estimates so that the implied sum of the sub-populations equals the aggregate rate. This means that country data on subpopulations could differ from reported rates in other sources when the underlying shares of the subpopulation in the labour force differ from the ILO’s estimates.
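The consistency requirement described above, that the sub-population rates combine to the aggregate rate when weighted by their shares, can be sketched as follows. The proportional rescaling used here is an assumption made for illustration; the source does not state which adjustment rule the ILO applies, and the function name and all figures are hypothetical.

```python
import math

def rescale_to_aggregate(rates, shares, aggregate_rate):
    """Scale sub-group rates by a common factor so that their
    share-weighted sum equals the aggregate rate."""
    implied = sum(rates[g] * shares[g] for g in rates)
    factor = aggregate_rate / implied
    return {g: r * factor for g, r in rates.items()}

# Hypothetical unemployment rates (%) and labour force shares (summing to 1).
rates = {"youth_male": 15.0, "youth_female": 18.0, "adult_male": 5.0, "adult_female": 6.0}
shares = {"youth_male": 0.10, "youth_female": 0.08, "adult_male": 0.45, "adult_female": 0.37}
aggregate_rate = 7.0

adjusted = rescale_to_aggregate(rates, shares, aggregate_rate)
check = sum(adjusted[g] * shares[g] for g in adjusted)
print(adjusted, check)
```

The sketch also shows why the adjusted rates can differ from figures published elsewhere: a different set of shares would imply a different scaling factor.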
Short-term projection model
For a subset of countries, the preliminary unemployment estimate for the current year and the projection for the following year are based on results from a country-specific short-term projection model. The ILO maintains a database on monthly and quarterly unemployment flows that contains information on inflow and outflow rates of unemployment, estimated on the basis of unemployment by duration. A multitude of models are specified that either project the unemployment rate directly or determine both inflow and outflow rates, using combined forecast techniques.
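As an illustration of what combining forecasts can mean, the sketch below averages the projections of several models, weighting each by the inverse of its past mean squared error. This particular weighting scheme is an assumption chosen for simplicity, not a description of the ILO's short-term projection model, and the function name and numbers are hypothetical.

```python
import numpy as np

def combine_forecasts(forecasts, past_errors):
    """Inverse-MSE weighted average of model forecasts.

    forecasts   : one forecast per model
    past_errors : one row of historical forecast errors per model
    """
    forecasts = np.asarray(forecasts, dtype=float)
    mse = np.mean(np.square(np.asarray(past_errors, dtype=float)), axis=1)
    weights = (1.0 / mse) / np.sum(1.0 / mse)   # more accurate models weigh more
    return float(weights @ forecasts), weights

# Two hypothetical models projecting next year's unemployment rate (%).
combined, weights = combine_forecasts(
    forecasts=[5.0, 6.0],
    past_errors=[[1.0, -1.0], [2.0, -2.0]],     # second model has been less accurate
)
print(combined, weights)
```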
Labour Force Estimates and Projections (LFEP)
The ILO programme on labour force estimates and projections is part of a larger international effort on demographic estimates and projections to which several UN agencies contribute. Estimates and projections of the total population and its components by sex and age group are produced by the UN Population Division; those of the employed, unemployed and related populations by the ILO; those of the agricultural population by FAO; and those of the school-attending population by UNESCO.
The main objective of the ILO programme is to provide member States, international agencies and the public at large with the most comprehensive, detailed and comparable estimates and projections of the labour force for countries and territories, the world as a whole and its main geographical regions.
Labour underutilization and employment structure
The ILO modelled estimates include multiple indicators related to labour underutilization and employment structure. To measure labour underutilization, unemployment disaggregated by sex and age is available. With the same disaggregation, estimates are also produced for the labour underutilization rates (LU2, LU3 and LU4), the NEET rate (youth not in employment, education or training), time-related underemployment and the potential labour force. Similarly, to analyse the employment structure of the countries included in the ILO modelled estimates, the distribution of employment is estimated along four dimensions: employment status, economic activity (sector), occupation, and economic class (working poverty). Finally, for all the indicators of labour underutilization and employment structure, except for economic class, a breakdown by geographical area (urban and rural) is produced.
The number of working hours lost is estimated by means of a nowcasting model. This method uses data that are available almost in real time to predict aggregate hours worked, which are published with substantial delay. The nowcasting model makes it possible to produce the following indicators:
- Percentage of hours lost due to the COVID-19 crisis, compared to the baseline (the latest pre-crisis quarter, i.e., the 4th quarter of 2019, seasonally adjusted). Hence, the figures reported should not be interpreted as a quarterly or an inter-annual growth rate. The first year with available estimates is 2020.
- Full-time equivalent employment losses (assuming a 40-hour or 48-hour working week). This measure is constructed by dividing the number of weekly hours lost due to COVID-19 by 40 or 48. It thus illustrates the magnitude of the hours lost by expressing them as full-time jobs. The first year with available estimates is 2020.
- Total weekly hours worked by employed persons and weekly hours worked divided by the population aged 15-64. These time series start in 2005, because the estimates combine nowcast results for 2020 with historical time series data on hours worked and population from ILOSTAT.
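The arithmetic behind the first two indicators can be sketched as follows. The function names and the figures are hypothetical and serve only to show the calculation; the published estimates come from the nowcasting model, not from a direct comparison of two observed totals.

```python
import math

def pct_hours_lost(baseline_hours, current_hours):
    """Hours lost as a percentage of the baseline (here, the pre-crisis quarter)."""
    return 100.0 * (baseline_hours - current_hours) / baseline_hours

def fte_jobs_lost(weekly_hours_lost, hours_per_week=48):
    """Express weekly hours lost as full-time equivalent jobs."""
    return weekly_hours_lost / hours_per_week

# Hypothetical economy: 1,000 million weekly hours in the baseline quarter,
# 912 million weekly hours in the quarter being assessed.
baseline, current = 1000e6, 912e6
lost = baseline - current

share_lost = pct_hours_lost(baseline, current)   # percentage relative to the baseline
fte_40 = fte_jobs_lost(lost, hours_per_week=40)  # jobs, assuming a 40-hour week
fte_48 = fte_jobs_lost(lost, hours_per_week=48)  # jobs, assuming a 48-hour week
print(share_lost, fte_40, fte_48)
```

Because the percentage is always measured against the same baseline quarter, it is a level comparison and not a quarter-on-quarter or year-on-year growth rate.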
The data in the nowcasting model include a variety of indicators of economic activity and of the evolution of the labour market, such as:
- labour force survey data
- administrative data on the labour market, such as registered unemployment
- up-to-date mobile phone data from Google Mobility Reports
- Oxford’s COVID‑19 Government Response Stringency Index
- data on the incidence of COVID-19
Given the exceptional situation, the estimates are subject to a substantial amount of uncertainty. The unprecedented labour market shock created by the COVID-19 pandemic is difficult to assess by benchmarking against historical data. Furthermore, at the time of estimation, consistent time series of readily available and timely high-frequency indicators were still relatively scarce. For these reasons, the estimates are subject to regular updates and revision.
For more information on the modelling technique, refer to the annexes of ILO Monitor: COVID-19 and the world of work.
Labour income share and distribution
The Labour Income Share and Distribution dataset covers 189 countries as well as global and regional aggregates. The data are based on the ILO Harmonized Microdata collection. In order to produce consistent time series for all countries, statistical models are used to extrapolate and impute missing data points. The dataset contains two key indicators: the labour income share and the labour income distribution, following the recommendation of the ILO Global Commission on the Future of Work to develop new distributional indicators. Furthermore, the new internationally comparable labour share data will be used to monitor progress towards the United Nations’ Sustainable Development Goals.
Wage growth rates
The methodology to estimate global and regional wage trends was developed by the ILO for the previous editions of the Global Wage Report (GWR) in collaboration between technical departments and the Department of Statistics, following four peer reviews conducted by five independent experts. The appendix of the GWR describes the methodology adopted as a result of this process.
In 2015, the ILO developed a methodology for generating global and regional estimates of international migrant workers and issued the first edition of ILO global estimates on migrant workers: Results and methodology, including global and regional estimates of international migrant workers and international migrant domestic workers, with reference year 2013.