1 Introduction

Although an emerging body of literature examines the determinants of self-employment and the push-pull aspects of self-employment choice in developing countries (Tamvada, 2021; Audretsch et al., 2013; Kim & Cho, 2009; Earle & Sakova, 2000), little is known about the dynamics of self-employment in such contexts. Beginning with Kuznets (1966), a compelling body of theoretical literature suggests that self-employment and the stage of economic development are inversely related. Lucas (1978) predicts that entrepreneurship decreases with economic development. Iyigun and Owen (1999) claim that as an economy develops, individuals invest time in accumulating professional skills through education rather than in accumulating entrepreneurial human capital. Several empirical studies based on aggregated databases and cross-national studies confirm a negative association between the rate of self-employment and the stage of economic development (Pietrobelli et al., 2004; Folster, 2002; Acs et al., 1994).Footnote 1

However, analyses based on aggregate data provide only a partial picture of the dynamics and distributional changes in self-employment choice. They do not capture the inherent dynamics which, in the Indian context for instance, are driven by both the momentum of growth and a historically rooted caste divide. There is a visible interplay of social and economic dynamics in individuals’ entrepreneurial choices through self-employment, but micro analyses of self-employment dynamics in developing countries are sparse in the extant literature, mainly because of the non-availability of panel data.Footnote 2 To overcome this challenge, the paper offers a novel empirical approach to examining self-employment dynamics by constructing pseudo panels of cohorts of individuals and tracking them over time. These pseudo panels are constructed from repeated cross-sectional individual-level microdata (Deaton, 1985; Verbeek, 2006) and provide new evidence on the dynamics of self-employment choice in India.Footnote 3

The paper aims to examine the following: (1) the role of university education for self-employment over time in a developing country, (2) the moderating role of development on the impact of education on self-employment over time, (3) the role of social class for self-employment over time,Footnote 4 and (4) the moderating role of education on the impact of social class on self-employment over time. In addition, the paper examines how these effects vary across (a) the non-agriculture and agriculture sectors and (b) rural and urban areas.

This paper makes several novel contributions to the literature on occupational choice. Firstly, it offers novel insights on the role of social class for self-employment choice along different stages of economic development, a topic that has rarely received attention in the literature. In particular, it advances the seminal contribution of Audretsch et al. (2013) by demonstrating that higher education moderates the negative impact of social class on self-employment choice over time. Secondly, it empirically tests the theoretical claim of Iyigun and Owen (1999) that in the early stages of economic development individuals invest in professional human capital and not in entrepreneurial abilities. Thirdly, it contributes to an emerging body of literature on self-employment in developing countries (Tamvada, 2021; Tamvada, 2015; Audretsch et al., 2013; Mohapatra et al., 2007) by examining the dynamics of self-employment activity over time. It offers a unique approach to examining the dynamics of self-employment, using pseudo panels to overcome the limitations posed by the non-availability of panel data.

The pseudo-panels are constructed from three cross-sectional databases collected by India’s National Sample Survey Organisation (NSSO) during 1994–1995, 1999–2000, and 2004–2005. The paper provides the first insights into the role of social class for self-employment over time while providing empirical support for a compelling theoretical literature on the evolution of self-employment. The results partially support the claim of Iyigun and Owen (1999) that as an economy develops individuals prefer to invest in professional human capital instead of entrepreneurial human capital. The results suggest that individuals who acquire education and wealth are less likely to transition into self-employment over time in non-agriculture, while similar individuals in agriculture are more likely to transition into self-employment over time. Furthermore, backward class individuals are less likely to transition into self-employment in both sectors, but education moderates this relationship. Although university education decreases the likelihood of self-employment over time, it increases self-employment in the most developed cohorts.

The remainder of the article is structured as follows. Section 2 presents the theoretical background for the study. In Section 3, the dataset and descriptive statistics are presented. In the fourth section, the empirical results relating to self-employment choice and the role of social class over time are discussed. The limitations and conclusions are presented in the final section.

2 Human capital, social identities, and self-employment over economic development stages

Amongst early researchers, Lucas (1978) predicts that entrepreneurship decreases with economic development. Iyigun and Owen (1999, pp. 213–215) suggest that “entrepreneurial human capital plays an important role in intermediate income countries, whereas professional human capital is relatively more important in richer economies.” Under the assumption that entrepreneurship is riskier than providing professional services they show that individuals begin to invest more time in accumulating professional skills by way of education than in accumulating entrepreneurial human capital as an economy develops. In their words, “[a]s per capita income grows and the payoff to being a professional increases, individuals are less willing to gamble on entrepreneurial ventures … as the return to the safe activity increases and the payoffs to the risky activity becomes more variable, human capital accumulators devote more time to schooling and less time to gaining entrepreneurial experience.” This claim is tested here by tracking the occupational behaviour of individuals with greater levels of human capital over time.

Furthermore, irrespective of the social apparatus an individual faces, it is reasonable to assume that inherent in human nature is a strong desire to grow and prosper, preferably with less risk and greater stability. As an illustration, consider two states of the economic environment, S1 (a sustained period of volatile economic growth) and S2 (a prospective period of stability and high growth momentum). Leaving aside any complex interplay of social dynamics, individuals may wish to experiment with a new business idea because of persistent unemployment and sluggish growth during the initial periods of development. Self-employment is risky, as the probability of failure in a new venture is greater than the probability of success. However, in the absence of any certain and stable economic opportunity, individuals are more inclined to become self-employed than to lose time in a prolonged job search if the probability of success in the job search is assumed to be lower than the probability of success in self-employment. As the economy progresses towards stability and transitions into S2, they are more likely to seek a stable job, for which they will be ready to devote time and resources to professional human capital development. Thus, investment in entrepreneurial human capital, conventionally a risky investment strategy for individuals, is likely to be less preferred than training for professional human capital and subsequent stable income generation as the economy transitions into S2.

Educated individuals have superior opportunities in salaried employment, yet they are also more likely to be able to identify entrepreneurial opportunities. The empirical studies examining the role of education for self-employment offer support for both views and are inconclusive (Van der Sluis et al., 2005; Van der Sluis et al., 2008). It is likely that the U-shaped relationship between education and self-employment (Åstebro et al., 2011; Blanchflower, 2000) can explain these diverse effects. Both low levels of education and high levels of education can increase self-employment. Individuals with low levels of education are more likely to be pushed into self-employment, while individuals with high levels of education are likely to be pulled into it. Education and human capital are linked to entrepreneurial success (Kolstad and Wiig, 2015; Unger et al., 2011) and returns to entrepreneurial activity (Van Praag et al., 2013; Hamilton, 2000). While pull factors such as high returns to capital can motivate individuals to enter self-employment and entrepreneurship, factors such as barriers to desirable salaried jobs can push individuals into self-employment as an occupation of last resort (Earle & Sakova, 2000). There is significant heterogeneity in informal enterprises in developing countries, with the informal sector consisting of both upper-tier and lower-tier enterprises (Cunningham & Maloney, 2001). The upper tier consists of competitive enterprises, while the lower tier consists of enterprises created by individuals rationed out of the formal labour market (Fields, 2004). While in some countries, such as Côte d’Ivoire, the informal sector consists of both voluntary and involuntary segments (Günther & Launov, 2012), there is increasing evidence that in some developing countries, such as Ghana, pull factors are attracting skilled workers into self-employment (Falco & Haywood, 2016). In more developed countries, self-employed individuals are likely to be more educated than self-employed individuals in less developed countries (van Stel & van der Zwan, 2020).

Some studies highlight that own-account self-employed individuals have characteristics that are consistent with the dualistic labour market hypothesis, while self-employed individuals with employees resemble the opportunity entrepreneurs of industrialized countries (Mandelman & Montes-Rojas, 2009). Tamvada (2010) examines the heterogeneity in the labour market in the Indian context and finds a hierarchy in returns to occupations, with employers having greater returns than salaried individuals, who in turn have greater returns than self-employed individuals across the welfare distribution. Using income data, Gindling and Newhouse (2014) examine the heterogeneity in self-employment in 74 developing countries and find that occupations have a pecking order similar to that in Tamvada (2010). They further find that as the economy develops individuals move from agriculture into unsuccessful non-agricultural self-employment, and from there into wage employment in non-agriculture or successful self-employment. Furthermore, a significant proportion of unsuccessful self-employed individuals have characteristics similar to those of successful self-employed individuals, suggesting that they experience barriers to their growth.

Margolis (2014) finds that social protection systems allow individuals to avoid subsistence self-employment, while labour market frictions increase self-employment rates. He posits that about one-third of self-employed individuals in the developing world are opportunity entrepreneurs who take initiative and risk. He suggests that self-employment rates decrease as countries develop and their institutions evolve, and wage employment becomes the main source of jobs. While self-employment is known to decline with economic development (Kuznets, 1966), it may increase in later stages of development, as in the case of OECD countries (Acs et al., 1994). Pietrobelli et al. (2004) examine data from 64 developing countries and 19 developed countries over 30 years to confirm the Kuznets hypothesis that self-employment tends to decline with economic development. However, they find that it can represent emerging opportunity entrepreneurship that boosts the development process in some contexts. Consistent with this, Mohapatra et al. (2007) find that self-employment in rural China has characteristics similar to those of a productive small business sector. Wennekers et al. (2002) examine self-employment data from 1899 to 1997 and find that self-employment rates in the Netherlands declined continuously until about the early 1980s before experiencing a revival in the following period. Fritsch et al. (2015) find that self-employment has increased in Germany following reunification, and that demographic developments, shifts to service sector employment, and increased university education have contributed to the increases in self-employment.

With respect to the role of education on self-employment, there is little consensus in the extant empirical literature (Premand et al., 2016; Van der Sluis et al., 2005, 2008).Footnote 5 While education expands an individual’s knowledge base and increases exposure to new opportunities, it also increases the opportunity cost of being self-employed. This suggests that returns to salaried employment increase faster than returns to entrepreneurship as per capita income grows with the result being that individuals have “more to lose” by engaging in entrepreneurship (Lucas, 1978) but may eventually come back to self-employment when they are likely to get higher returns from self-employment during later stages of development. Thus, there are compelling reasons to posit that individuals who are more educated will opt for salaried employment instead of self-employment over time. These lead to the following hypotheses:

Hypothesis 1a: University education decreases the likelihood of transitioning into self-employment over time.

Hypothesis 1b: Economic development positively moderates the role of university education for self-employment over time.

The potential role of social identity in the dynamics of self-employment choice is an open question. Entrepreneurship, for instance a small-scale start-up, requires a clear prospect of success in the venture, especially when the economy is at the take-off stage. Yet social gradients may limit entrepreneurial ambition and the spread of self-employment. The entrepreneurship literature has noted that occupational choice is influenced by individuals’ social identities (Audretsch et al., 2013).

An individual’s self-worth is influenced by their location in the social hierarchy (Foucault, 1982; Bourdieu, 1987; Goel and Deshpande, 2016). In line with social dominance theory, individuals lower in the social hierarchy are likely to be dominated by those higher in the hierarchy (Sidanius et al., 1992; Sidanius et al., 1994). In such cases, symbolic interactionism theory suggests that the social interactions between individuals will influence their behavioural choices (Mead, 1934; Blumer, 1986). Consistent with this theory, individuals from socially backward classes may find their social identity to be a limiting factor for their entrepreneurial choice. For these reasons, the caste system in India, as a historically persistent phenomenon, has multi-layered implications for employment choice along with regional and educational opportunities. It is a rigid form of social class structure that does not depend on the colour of the skin (Deshpande, 2005) but shares many qualities of the social stratification found in other parts of the world (Donoghue, 1957; Berreman, 1960).

The caste divide in India is a cultural and social phenomenon that has been part of the traditions of Indian society for thousands of years. In the sociological discourse on the Indian caste system, some groups are referred to as lower castes or backward classes (Revankar, 1971; Srinivas et al., 1962). In line with social dominance theory (Sidanius et al., 1992; Sidanius et al., 1994; Sidanius & Pratto, 2001), if backward class individuals believe that it is socially unacceptable for them to start businesses, they may not enter self-employment, leading to caste-based differences in business ownership.

Furthermore, this may lead to strong social boundaries (Kim & Aldrich, 2005) that limit access to resources. The institutional profile associated with the caste system may not support individuals from backward classes becoming self-employed (Audretsch et al., 2013), and may limit access to finance and information (Bönte & Filipiak, 2012). Gupta et al. (2018), amongst others, study the extent to which the caste system affects individuals’ social identity and their consumption levels. Many authors also note that the caste system is a major obstacle to achieving development goals because affected populations can be excluded from the development process. For instance, using NSS data for the period 1983–1999, Kijima (2006) reports that the disparities in living standards between marginalized backward social classes and non-marginalized backward social classes remain very high. Despite significant governmental efforts since independence, these marginalized groups have low occupational mobility and are engaged mostly as agricultural labourers or in self-employed agricultural work (Gang et al., 2017), suggesting significant social inequality between high and low castes. As a consequence, individuals from backward social classes have lower mobility in the labour market and report higher poverty levels and income inequality (Desai and Dubey, 2012).

Using social distance theory, Gupta et al. (2018) argue that social identity plays a role in individual alienation. However, it is well known that the years after liberalization in the early 1990s unleashed many economic opportunities in the Indian economy, and there are compelling reasons to assume that the dynamic economic environment is influencing the occupational behaviour of people. Individuals from the deprived social class may seek professional human capital in times of rapid economic growth, rather than devote time to building entrepreneurial human capital, which needs stronger social networks, amongst other things, to be in place. However, they are likely to be less constrained by the caste system over time, as the tacit restrictions on occupational choice may become less binding as they gain more education. These lead to the following hypotheses:

Hypothesis 2a: As an economy develops, individuals belonging to backward social classes are more likely to become wage employees.

Hypothesis 2b: As an economy develops, education moderates the effect of social class, and educated individuals belonging to backward social classes are less likely to be inhibited from entering into self-employment.

3 Data construction, characteristics and estimation issues

3.1 Pseudo-panel data construction

As the objective is to characterise self-employment choice along the development path of an economy, conditional on existing social dynamics such as the prevalence of a caste hierarchy, panel data that track individuals over time are needed. However, the existing employment-unemployment data of the National Sample Survey Organisation of India (NSSO) do not report information with which an individual’s employment status can be tracked. Panel data, in general, are limited in their availability over time and suffer from attrition. In this situation, a “pseudo-panel” that follows cohorts, that is, stable groups of individuals, rather than individuals over time can shed light on these dynamics. Here, individual variables are replaced by their intra-cohort means. As the database is in the form of independent repeated cross sections, pseudo-panels are constructed as an alternative to panel data for estimating the empirical models.

This method is helpful because, owing to the linearity of the transformation to intra-cohort means, the linear model with individual fixed effects has a direct pseudo-panel counterpart. The individual fixed effect is replaced by a cohort effect, and the model is particularly simple to estimate if the cohort effect can itself be considered a fixed effect. For this to hold, the criterion used to form the cohorts must satisfy several conditions: it must be observable for all individuals, it must partition the population (each individual is classified into exactly one cohort), and it must correspond to a characteristic of individuals that usually does not change over time, such as gender. Furthermore, the size of the cohorts involves a trade-off between bias and variance. Cohorts must be large enough to limit the measurement error in the intra-cohort variable means, which leads to biased and imprecise estimators of the model parameters.

3.1.1 The architecture of pseudo-panels

As Deaton (1985) argues, in the absence of genuine panel data, repeated cross-sectional data can be used to construct synthetic or pseudo-panels. A pseudo-panel based on, for instance, age cohorts, gender, or education levels can be used to control for at least cohort fixed level effects. Such methods are similar to instrumental variable methods where group dummies are used as instruments.Footnote 6 To explicate, consider the following linear model with individual effects,

$$ y_{it}=x_{it}\beta+\alpha_{i}+e_{it}, $$
(1)

for \(i = 1,\dots ,N\) and \(t = 1,\dots ,T\).

For simplicity, we assume that observations on N individuals are available for all the time periods. When the individual fixed effects \(\alpha_{i}\) are uncorrelated with \(x_{it}\), it is possible to pool the cross sections to consistently estimate the regression coefficients β. In most situations, however, the correlation between the individual effects and some of the explanatory variables implies that the K moment conditions given by \(E\{(y_{it}-x_{it}\beta)x_{it}\}=0\) are violated, in which case the cross sections are not poolable. If the data are genuine panel data, the fixed effects approach can be used to treat the \(\alpha_{i}\) as unknown fixed parameters. However, if data on the same individuals are not available for each year, this approach cannot be used.

Following Deaton (1985), the observations are aggregated to cohort levels, where cohorts represent people of similar characteristics. In this case, the model assumes the following form,

$$ \overline{y}_{ct}=\overline{x}_{ct}\beta+\overline{\alpha}_{ct}+\overline{e}_{ct}, $$
(2)

for \(c = 1,\dots ,C\) and \(t = 1,\dots ,T\), where the variables are aggregated to cohort-level averages. This pseudo-panel, however, does not allow consistent estimation of β, as \(\overline {\alpha }_{ct}\) is likely to be correlated with \(\overline {x}_{ct}\). Under the assumption that \(\overline {\alpha }_{ct}\) is fixed over time, the above equation can be consistently estimated. This is very likely to be the case when the average cohort size \(n_{c}\rightarrow \infty \). In such a case, the natural estimator for β is the within estimator given by,

$$ \begin{array}{@{}rcl@{}} \hat{\beta}_{W}&=&\left( \sum\limits_{c=1}^{C}\sum\limits_{t=1}^{T}\left( \overline{x}_{ct}-\overline{x}_{c}\right)\left( \overline{x}_{ct}-\overline{x}_{c}\right)^{\prime}\right)^{-1}\\&&\sum\limits_{c=1}^{C}\sum\limits_{t=1}^{T}(\overline{x}_{ct}-\overline{x}_{c})(\overline{y}_{ct}-\overline{y}_{c}) \end{array} $$
(3)
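As a minimal illustration of Eq. (3), the following Python sketch computes the within estimator from arrays of cohort means, taking \(\overline{x}_{c}\) and \(\overline{y}_{c}\) to be each cohort's averages over the T periods. The function name, the array layout and the synthetic data are assumptions made for this example only; they are not taken from the NSSO data or from the estimation code used in the paper.

```python
import numpy as np

def within_estimator(x_bar, y_bar):
    """Within (cohort fixed-effects) estimator on cohort-level means.

    x_bar: array of shape (C, T, K), cohort-period means of the regressors.
    y_bar: array of shape (C, T), cohort-period means of the outcome.
    """
    # Centre each cohort's observations on that cohort's mean over time.
    x_dev = x_bar - x_bar.mean(axis=1, keepdims=True)
    y_dev = y_bar - y_bar.mean(axis=1, keepdims=True)
    K = x_bar.shape[2]
    X = x_dev.reshape(-1, K)
    y = y_dev.reshape(-1)
    # Solve (sum of x x') beta = (sum of x y), as in Eq. (3).
    return np.linalg.solve(X.T @ X, X.T @ y)

# Synthetic illustration (hypothetical numbers): 40 cohorts, 3 rounds, 2 regressors.
rng = np.random.default_rng(0)
C, T, K = 40, 3, 2
beta_true = np.array([0.5, -0.3])
alpha_c = rng.normal(size=(C, 1))                          # cohort effects
x_bar = rng.normal(size=(C, T, K)) + alpha_c[:, :, None]   # correlated with alpha_c
y_bar = x_bar @ beta_true + alpha_c + 0.01 * rng.normal(size=(C, T))
beta_hat = within_estimator(x_bar, y_bar)
```

Because the cohort effect is constant over time in this synthetic example, centring removes it, and `beta_hat` should lie close to `beta_true` even though the regressors are correlated with the cohort effects.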

As described in Verbeek (2006), the asymptotic behaviour of pseudo-panel estimators can be derived for the following alternative asymptotic sequences. First, when \(N\rightarrow \infty \), with C fixed, so that \(n_{c}\rightarrow \infty \). Second, when \(N\rightarrow \infty \) and \(C\rightarrow \infty \), with \(n_{c}\) fixed. Third, \(T\rightarrow \infty \), with N, C and \(n_{c}\) fixed. While Moffitt (1993) and Verbeek and Vella (2005) employ the asymptotics of the first type, Deaton (1985) and Verbeek and Nijman (1993) employ the second type.

In this paper, we assume asymptotics of the first type. In this case, the fixed effects estimator is a consistent estimator for β, when

$$ \mathrm{plim}\, \frac{1}{CT} \sum\limits_{c=1}^{C}\sum\limits_{t=1}^{T} (\overline{x}_{ct}-\overline{x}_{c})(\overline{x}_{ct}-\overline{x}_{c})^{\prime} $$
(4)

is finite and invertible and

$$ \mathrm{plim}\, \frac{1}{CT} \sum\limits_{c=1}^{C}\sum\limits_{t=1}^{T} (\overline{x}_{ct}-\overline{x}_{c})\alpha_{ct}=0 $$
(5)

As \(n_{c}\rightarrow \infty \), the above conditions are automatically satisfied because the cohort effects converge to a constant over time, that is, \(\alpha _{ct}\rightarrow \alpha _{c}\) (Moffitt, 1993).Footnote 7 Deaton (1985) relies on asymptotics of the second type and does away with the necessity to have large numbers of observations in each cohort. This is achieved by considering the cohort averages as error-ridden measurements of the population averages of the cohorts. By assuming that measurement errors are distributed with zero mean, the moment matrices of the within estimator are adjusted to correct for the measurement error.Footnote 8
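The convergence of the cohort effect as the cohort size grows can be illustrated with a small simulation. The sketch below is only a hedged illustration: the distribution of the individual effects (mean 1.0, standard deviation 2.0), the number of repeated draws and the function name are all hypothetical choices, not quantities from the paper. Each "round" draws a fresh sample of \(n_{c}\) individuals from the same cohort, as in repeated cross sections, and records the sample mean of their individual effects.

```python
import numpy as np

def cohort_mean_effects(n_c, rounds=200, seed=0):
    """Sample means of individual effects for one cohort, drawing a fresh
    sample of n_c individuals in each of `rounds` independent surveys."""
    rng = np.random.default_rng(seed)
    # Hypothetical population of individual effects: mean 1.0, s.d. 2.0.
    alpha = rng.normal(loc=1.0, scale=2.0, size=(rounds, n_c))
    return alpha.mean(axis=1)

# Spread of the cohort-level effect across surveys for small and large cohorts.
sd_small = cohort_mean_effects(n_c=50).std()
sd_large = cohort_mean_effects(n_c=5000).std()
```

With larger cohorts the sample mean of the individual effects varies much less from one survey to the next, which is the sense in which the cohort effect can be treated as fixed over time.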

3.2 Estimation and selection issues

When the cohort selection criterion has the qualities required to treat the pseudo-panel estimation as a fixed effects model, the parameters are generally estimated using standard panel data estimation techniques. A within transformation is applied as in standard panel data estimation: for each cohort, the variables are centred on the mean of the values observed for that cohort over all the observation dates (in this case, the 1994–1995, 1999–2000 and 2004–2005 rounds). This removes the cohort effect and yields the cohort fixed effects (within) estimator.

In the estimations, there may be some endogeneity or self-selection issues, particularly with respect to wages and assets. The empirical strategy addresses this problem by introducing an extensive set of land controls in the estimation. Thus, the panel estimation results can also be viewed as conditional on these controls. Because panel data are not available, we construct pseudo panels to track self-employment over time. Our sample has only three time periods (three survey rounds), and therefore system GMM or Arellano-Bond type dynamic panel estimation, which would tackle potential endogeneity, is not feasible: these estimators require deeper lags of the differences of both the predictors and the dependent variable. Instead, we use the land ownership variables to closely approximate assets.

3.3 Data source and distributional characteristics

For the pseudo-panel analysis, the 50th round (collected during 1994–1995), the 55th round (collected during 1999–2000) and the 61st round (collected during 2004–2005) of the employment-unemployment surveys of the National Sample Survey Organisation of India (NSSO) are used. The sample is restricted to those who are older than 15 years but younger than 70 years. Family members who assist household enterprises, children and the elderly, and people classified into other miscellaneous occupational categories are excluded from the analyses. Individuals who have reported their principal economic activity to be self-employment (including own-account workers and employers), salaried employment, casual labour, or unemployment are included in the sample.Footnote 9

Table 1 presents the summary statistics from the three cross-sectional databases. For 1994–1995, the sample consists of 164,543 individuals; for 1999–2000, 171,361 individuals; and for 2004–2005, 170,281 individuals. The table suggests that 40.4% of individuals are self-employed in 1994–1995. This rises to 44.6% in 1999–2000 and 53% in 2004–2005.

Table 1 Descriptive statistics (full sample)

Pseudo-panels are constructed from these cross-sectional databases for 1994–1995, 1999–2000, and 2004–2005 by forming cohorts of individuals based on 5-year age bands interacted with the state of their residence. For the asymptotic reasons described earlier, only those cohorts that have at least 500 observations in each of the surveys are considered for the analysis.Footnote 10 Table 2 presents the summary statistics of the cohorts of the pseudo-panel that are constructed from the cross-sectional databases. The estimations use the variables reported in Table 2.
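The cohort construction described above can be sketched in Python with pandas. This is a hedged illustration rather than the code used for the paper: the column names (`survey_year`, `age`, `state`, `self_employed`) are hypothetical, and the age bands are expressed as 5-year birth-year bands so that the same group of people is followed across rounds, which is an assumption about how the bands are aligned over time.

```python
import pandas as pd

def build_pseudo_panel(df, min_obs=500, band_width=5):
    """Collapse pooled cross-sections into cohort-by-round means.

    Assumed (hypothetical) columns: 'survey_year', 'age', 'state', plus
    numeric variables (e.g. 0/1 indicators) to be averaged within cells.
    """
    df = df.copy()
    # Birth-year bands keep the same group of people together across rounds.
    birth_year = df["survey_year"] - df["age"]
    df["birth_band"] = (birth_year // band_width) * band_width
    keys = ["state", "birth_band"]
    # Number of individuals in each cohort and round (0 if the cell is empty).
    sizes = df.groupby(keys + ["survey_year"]).size().unstack("survey_year", fill_value=0)
    # Keep only cohorts with at least min_obs individuals in every round.
    keep = (sizes >= min_obs).all(axis=1)
    kept = sizes[keep].index
    mask = pd.MultiIndex.from_frame(df[keys]).isin(kept)
    # Replace individual values by intra-cohort means for each round.
    return (df[mask]
            .drop(columns=["age"])
            .groupby(keys + ["survey_year"])
            .mean(numeric_only=True))

# Toy data (hypothetical): two states, three rounds, a small threshold.
rows = []
for year in (1994, 1999, 2004):
    for state in ("A", "B"):
        for birth in (1960, 1961, 1970):
            # State B has nobody born in 1970 in the 2004 round.
            if state == "B" and birth == 1970 and year == 2004:
                continue
            for self_emp in (0, 1, 1):
                rows.append({"survey_year": year, "state": state,
                             "age": year - birth, "self_employed": self_emp})
toy = pd.DataFrame(rows)
panel = build_pseudo_panel(toy, min_obs=3)
```

In the toy data, the cohort for state B and the 1970 birth band is dropped because it is empty in one round; with the paper's threshold, `min_obs` would be 500.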

Table 2 Descriptive statistics (cohorts)

Figure 1 reports the adaptive kernel density plots for the dependent variable, the proportion of self-employed individuals in cohorts, for all three periods. The distributions do not appear to be strictly unimodal: both the left-hand and right-hand sides contain a significant number of observations, implying the existence of some clusters. Given the large number of cross-sectional observations, the mean, median and mode of each distribution may nevertheless appear to coincide.

Fig. 1 Distribution of self-employment in cohorts. (a) 1994–1995. (b) 1999–2000. (c) 2004–2005

Figure 2 plots the bivariate kernel density, with contour surface plots, of self-employment and university education in the cohorts for 1994–1995, 1999–2000 and 2004–2005. These plots offer a visual impression of the surface of the joint density, and thus of the relative heights and the concentration of the observations in the database. Figure 3 illustrates the surface of the joint density of the proportion of self-employed individuals and social class. The expanding contours show the relative spreads, indicating that the principal economic activities are distributed heterogeneously across castes. They also show some fixed patterns of red contour areas, indicating a significant degree of association between these two variables.

Fig. 2 Bivariate distribution of self-employment and university education. (a) 1994–1995. (b) 1999–2000. (c) 2004–2005

Fig. 3 Bivariate distribution of self-employment and social class. (a) 1994–1995. (b) 1999–2000. (c) 2004–2005

A number of other factors influence occupational choice, and these are introduced as controls in the estimations. In particular, wealthier individuals have more of a “safety net” when embarking on a new venture than their less wealthy counterparts. Wealth itself can make financing self-employment possible, and it also makes it easier to obtain credit. Households with very high levels of wealth have a higher propensity to take risk (Carroll, 2000). Blanchflower and Oswald (1998) find that inheritance increases the probability of self-employment. Banerjee and Newman (1993) argue that wealth distribution determines occupational structure. For these reasons, wealthy individuals are more likely to enter self-employment over time, and we control for these effects by introducing the land possessed variables. Furthermore, most empirical evidence suggests a positive relationship between age and entrepreneurship (Evans & Leighton, 1989a; Blanchflower & Meyer, 1994; Blanchflower, 2000), and married individuals are more likely to be self-employed than their unmarried counterparts (Borjas, 1986; Blanchflower & Oswald, 1998).Footnote 11 These variables are also introduced in the estimations as controls.

4 Empirical results

The empirical results are presented in the following order. Firstly, the results of the pseudo-panel regressions for all the cohorts are presented in Table 3. Following this, the pseudo-panel regression results for the non-agricultural and agricultural sectors are presented in Table 4, and the results for rural and urban areas are presented in Table 5.

Table 3 Dynamics of self-employment, social class and education over development (pseudo-panel estimations)
Table 4 Dynamics of self-employment, social class and education over development in non-agriculture and agriculture (pseudo-panel estimations)
Table 5 Dynamics of self-employment in rural and urban areas (pseudo-panel estimations)

The first column of Table 3 presents the base specification of the empirical model for all cohorts.Footnote 12 The results suggest that increases in high school education in a cohort increase self-employment in the cohort but University education reduces it. As the negative and significant coefficient of the Backward Class variable suggests, an increase in the proportion of individuals belonging to socially backward classes in a cohort decreases the proportion of self-employed individuals in the cohort. Column 2 of Table 3 introduces the interaction between the Backward Class and University variables into the estimation but this interaction term is insignificant here.

In column 3, the results of the two-stage estimation are presented. In the first stage, a measure of development (the log of mean per capita consumption) is regressed on the other variables. Following this, the predicted development measure is interacted with the University education variable and introduced in the estimation. This approach helps resolve part of the correlation between the development indicator and the other predictors, and allows us to examine how development moderates the effect of education on self-employment. The results in column 3 show that the coefficient of the predicted \(\widehat {Develop}\) variable is significant and negative, suggesting that the process of development moves individuals out of self-employment. However, the positive and significant interaction between \(\widehat {Develop}\) and University suggests that in the most developed cohorts, University education has a positive impact on self-employment over time. Thus, the results support hypotheses 1(a), 1(b) and 2(a).
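
The mechanics of such a two-stage procedure can be illustrated with a minimal sketch on synthetic data. It is not the estimation code behind Table 3: the data-generating coefficients, the variable names and the helper `ols` are all hypothetical, and the sketch assumes a first-stage-only regressor `z` so that the predicted development measure is not perfectly collinear with the second-stage regressors. The text does not specify which variables enter the first stage.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic cohort-level data; every number below is made up for illustration.
random.seed(0)
n = 200
univ = [random.random() for _ in range(n)]       # share with University education
backward = [random.random() for _ in range(n)]   # share from backward classes
z = [random.random() for _ in range(n)]          # hypothetical first-stage-only regressor
develop = [1.0 + 0.8 * u - 0.5 * b + 0.6 * zi + random.gauss(0, 0.1)
           for u, b, zi in zip(univ, backward, z)]
self_emp = [0.6 - 0.4 * u - 0.2 * b - 0.3 * d + 0.5 * d * u + random.gauss(0, 0.05)
            for u, b, d in zip(univ, backward, develop)]

# Stage 1: regress the development measure on the other variables.
X1 = [[1.0, u, b, zi] for u, b, zi in zip(univ, backward, z)]
gamma = ols(X1, develop)
develop_hat = [sum(g * x for g, x in zip(gamma, row)) for row in X1]

# Stage 2: enter predicted development and its interaction with University.
X2 = [[1.0, u, b, d, d * u] for u, b, d in zip(univ, backward, develop_hat)]
beta = ols(X2, self_emp)
names = ["const", "University", "BackwardClass", "Develop_hat",
         "Develop_hat x University"]
for name, coef in zip(names, beta):
    print(f"{name:>26}: {coef: .3f}")
```

In this sketch the last two coefficients play the roles of the \(\widehat {Develop}\) term and its interaction with University discussed above.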

As these results consider agricultural and non-agricultural individuals together in the cohorts, the estimated effects are net effects of both sectors. If the effects differ across the two sectors, the net effects are likely to mask these differences. For this reason, in Table 4, the models are re-estimated separately for cohorts of individuals in non-agriculture in columns 1–3 and cohorts of individuals in agriculture in columns 4–6. The results strongly support the view that education, social class and development all have a significantly negative impact on self-employment in non-agriculture. However, the interaction term of Backward Class and University is positive and significant in columns 2 and 3, suggesting that University education enables socially backward class individuals to overcome the barriers to self-employment. Similarly, the coefficient of the interaction between the predicted \(\widehat {Develop}\) variable and University is positive, suggesting that in the most developed cohorts, individuals with University education are more likely to enter self-employment. Thus, the empirical results presented in columns 1–3 of Table 4 strongly support hypotheses 1(a), 1(b), 2(a) and 2(b).

In contrast to the non-agriculture sector, University education has no significant effect in the agriculture sector, although increases in high school education have a positive effect on self-employment in this sector. However, the effect of belonging to a backward class and the moderating role of University education are similar to the effects in non-agriculture. Thus, the inhibiting effects of the caste system persist in the agriculture sector, but University education allows backward class individuals to overcome the social barriers to self-employment. In contrast to the non-agriculture sector, the positive and significant coefficient of the predicted \(\widehat {Develop}\) variable suggests that the process of development increases self-employment in agriculture. Thus, the empirical results presented in columns 4–6 of Table 4 for the agriculture sector do not support hypotheses 1(a) and 1(b) but support 2(a) and 2(b).

In summary, the results in Table 4 suggest that belonging to a socially backward class has, on average, a negative effect on self-employment choice; however, this effect is moderated by University education. In both the non-agriculture and agriculture sectors, the interaction of higher education with social class produces a positive effect on self-employment choice, suggesting that higher education ameliorates the negative impact of social class on self-employment. Thus, education enables individuals to overcome the limitations posed by the caste system, and the results offer significant support to hypotheses 2(a) and 2(b) in both sectors.
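
The moderation described above can be read off a linear specification with an interaction term: the marginal effect of the Backward Class share equals its own coefficient plus the interaction coefficient times the University share. The short sketch below shows this arithmetic. The coefficient values are hypothetical placeholders chosen only to match the signs reported in the text, not estimates from Table 4, and the function names are illustrative.

```python
def backward_class_effect(beta_bc, beta_interaction, university_share):
    """Marginal effect of the Backward Class share at a given University share."""
    return beta_bc + beta_interaction * university_share

def offsetting_share(beta_bc, beta_interaction):
    """University share at which the Backward Class effect is exactly zero."""
    return -beta_bc / beta_interaction

# Hypothetical coefficients: negative main effect, positive interaction.
beta_bc, beta_int = -0.2, 0.5

for share in (0.0, 0.2, 0.4):
    effect = backward_class_effect(beta_bc, beta_int, share)
    print(f"University share {share:.1f}: Backward Class effect {effect:+.2f}")

print("Effect is offset at University share", offsetting_share(beta_bc, beta_int))
```

With a negative main effect and a positive interaction, the penalty associated with social class shrinks as the University share in the cohort rises, which is the pattern the results describe.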

For the control variables in the non-agriculture equations in columns 1–3 of Table 4, the coefficients are positive and significant in the midsized land ownership category and negative for the largest land size variable. In contrast, in the agriculture equations in columns 4–6 of Table 4, the largest land ownership variables are positive and significant. Thus, small amounts of land enable individuals to enter self-employment in non-agriculture, while individuals with large amounts of land choose self-employment in agriculture. Furthermore, the urban variable is significant and negative in the non-agriculture equations in columns 1–3 but insignificant in columns 4–6. This result is expected: an increase in the share of urban population in the cohort should have a negative influence on self-employment, as rural-urban migration is likely to increase the share of people working as salaried employees in non-agriculture. The effects of age and gender are consistent with the extant literature.

For a more nuanced picture of these effects in rural and urban areas, the models are re-estimated for rural and urban areas separately in Table 5. While most individuals working in agriculture live in rural areas, individuals working in non-agriculture are found in both urban and rural areas. As these areas are likely to have distinct effects on self-employment, the estimation results on pseudo-panels of cohorts of individuals in rural areas and urban areas in the non-agriculture sector are presented in Table 5. The results for rural cohorts are presented in columns 1–3 and urban cohorts in columns 4–6.
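
How area-specific pseudo-panel cells can be built from repeated cross-sections is sketched below. This is a hedged illustration only: the five-year birth bands, the record fields (`area`, `birth_year`, `survey_year`, `self_employed`) and the sample records are assumptions made for the example and do not reproduce the cohort definitions used in the paper.

```python
from collections import defaultdict

def cohort_shares(individuals, band=5):
    """Share of self-employed per (area, birth-cohort band, survey year) cell."""
    cells = defaultdict(lambda: [0, 0])  # cell -> [self-employed count, total count]
    for person in individuals:
        cohort = person["birth_year"] // band * band  # first year of the birth band
        key = (person["area"], cohort, person["survey_year"])
        cells[key][0] += person["self_employed"]
        cells[key][1] += 1
    return {key: se / total for key, (se, total) in cells.items()}

# Hypothetical individual-level records from two survey rounds.
sample = [
    {"area": "rural", "birth_year": 1961, "survey_year": 1994, "self_employed": 1},
    {"area": "rural", "birth_year": 1963, "survey_year": 1994, "self_employed": 0},
    {"area": "urban", "birth_year": 1962, "survey_year": 1994, "self_employed": 0},
    {"area": "rural", "birth_year": 1964, "survey_year": 2004, "self_employed": 1},
]

shares = cohort_shares(sample)
for cell, share in sorted(shares.items()):
    print(cell, share)
```

Each cell is one observation of a cohort in one survey round. Following the same birth band across rounds within an area gives the rural and urban pseudo-panels, on which the models can then be estimated separately.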

In both rural and urban areas, the results suggest that increases in University education over time have a significant negative effect on self-employment in cohorts, demonstrating that educated individuals become less likely to be self-employed.Footnote 13 Thus, in both rural and urban areas, individuals who acquire higher human capital are more likely to move out of self-employment over time. The coefficient of the Backward Class variable is significantly negative in columns 1 and 4 of Table 5, suggesting that an increase in the proportion of socially backward class people in a cohort reduces self-employment in it. The interaction variables of Backward Class with University are insignificant in columns 2, 3, 5 and 6. These insignificant coefficients are likely due to the smaller number of cohorts in the estimated models and the smaller number of individuals within the cohorts.

The predicted \(\widehat {Develop}\) variable has a significantly negative coefficient in columns 3 and 6, suggesting that in both rural and urban areas the process of development decreases self-employment. However, in column 6 the interaction between the predicted \(\widehat {Develop}\) variable and Prop. University is positive and significant, suggesting that in the most developed urban areas, highly educated individuals are more likely to enter self-employment over time.

In A Table 6, we introduce more education variables to test the robustness of the empirical results presented in Tables 3 and 4. The results are entirely consistent with the findings presented above. In A Table, as endogeneity can result from reverse causality in the relationships, it is examined by regressing land ownership on self-employment. In general, we find that self-employment does not have a statistically significant impact on land ownership in the majority of specifications. Hence, the proposed direction of causality, with land ownership affecting self-employment choice, can be considered valid.

5 Discussion and conclusion

Extant literature suggests that the evolution of an individual’s choice function is conditional on the development stages of an economy: from an unstable and regressive growth path to a path of progression and sustainability (Lucas, 1978; Iyigun and Owen, 1999). Focusing mainly on the industrialized nations and aggregate data, empirical literature has shown a long-term decline in self-employment until the 1970s and an observed trend of revival since the 1970s. Taking the case of OECD countries, for example, Bogenhold and Staber (1991) argue that the rise in self-employment in these countries since the 1970s may be “a response to deficiencies in labour markets rather than a sign of economic vitality”. However, such an inference is problematic for a fast-growing developing economy like India. This article adds to the sparse body of literature in the developing country context that examines the dynamics of self-employment choice over time using microdata.

Despite significant deficiencies in the labour market, self-employment and entrepreneurial choices in a developing country like India may be a sign of economic vitality. A certain sectoral bias while interacting with individuals’ social identities may form a natural bound for self-employment growth and entrepreneurial choice. However, educational attainments increasingly offset negative effects of caste-based social identities. This article studies and establishes these predictions by using pseudo-panel techniques to analyse the dynamics of self-employment in India. Using three different cross-sectional databases collected over 1994–2004, the pseudo-panel analysis tracks the dynamics of self-employment in India. The results suggest that in both agricultural and non-agricultural sectors, individuals from socially backward classes are less likely to enter into self-employment over time. However, University education moderates this effect of social class on self-employment.

Furthermore, the results support the claim that, as an economy develops, individuals prefer to invest in professional human capital instead of entrepreneurial human capital. Consistent with the theoretical predictions, the results suggest that individuals in non-agriculture who acquire education and wealth are likely to move out of self-employment over time. However, individuals in agriculture who acquire education and wealth are more likely to enter self-employment over time.

By examining the dynamics of self-employment and the characteristics of individuals who enter it over time, we are able to indirectly examine the push-pull dynamics of self-employment. The inherent heterogeneity within the self-employment sector is evident: University education and development processes decrease self-employment in the cohorts, suggesting that necessity entrepreneurship declines over the course of economic development. The positive and significant coefficient on the interaction term between the predicted development measure and University education suggests that individuals are able to pursue opportunity entrepreneurship in the most developed cohorts. The results demonstrate that educated individuals are less likely to enter self-employment as the economy develops, suggesting that they are less likely to be pushed into self-employment, but in the most developed cohorts they are pulled into it. Thus, we are able to indirectly identify the push-pull aspects of self-employment choice over the economic development trajectory.

The article has important policy implications. It suggests that public policy should focus on facilitating industrial development in the initial stages of development as individuals are motivated to invest in professional human capital during these periods while encouraging entrepreneurial activities when a more developed economy can provide opportunities for greater returns in self-employment. However, at all stages of economic development, policy needs to ensure that social barriers do not inhibit individuals from transitioning into self-employment and entrepreneurship.

Overall, the paper makes several novel contributions to the extant literature. In particular, it sheds fresh light on the role of social identity for self-employment along stages of development. Empirically, it offers partial support to the claim of Iyigun and Owen (1999) that as an economy develops individuals prefer to invest in professional human capital instead of entrepreneurial human capital. The article sheds fresh light on the self-employment process in India, a rapidly growing country that is trying to break social barriers to entrepreneurship.

Institutional theory (North, 1991; Scott, 1995; DiMaggio, 1998) has underscored the role of institutions in shaping the economic behaviour of individuals. The institutional context may have a significant influence on entrepreneurial decision-making (Tolbert et al., 2011; Welter & Smallbone, 2011). For instance, following Mead (1934), Klyver et al. (2013) suggest that the gender gap in self-employment is attributable to the socially constructed institutional context. However, we do not pursue institutional aspects here, as we restrict the focus to the individual determinants of self-employment to avoid introducing further complexity into the construction of the pseudo-panels and the empirical models. Furthermore, the limitations of the data do not allow us to examine the hybrid nature of self-employment activity. These can be fruitful areas for future research examining self-employment dynamics in developing countries. Following O’Connor (2013), future research can also examine the role of entrepreneurship education, rather than education per se, for self-employment choice.