At the outset, I completely agree with our former chief statistician Pronab Sen that “statisticians are not stupid”. However, one must humbly accept that statisticians can be wrong and, quite sadly, in his argument he blatantly is. The simple point of my initial article (“The Sample Is Wrong”, IE, July 7) was that surveys in India systematically underestimate the level of urbanization because of fundamental flaws in sampling methodology, and so give us, time and again, biased estimates of various indicators of interest. The former chief statistician published a scathing rebuttal (“Statistics are not stupid”, IE, July 10) in which he put forward two main points of criticism. Let me explain why both are misleading: the first is simply incorrect, and the second is incomplete and therefore imprecise.
The main point of Sen’s article is that the National Sample Survey (NSS) and the Census differ in their estimates of the rural share of the population because the two define rural and urban areas differently. This is strange. The comprehensive report of the NSS (Golden Jubilee Publication, 2001), which is available on the MOSPI website, details the concepts and definitions used in the surveys. The report clearly defines rural and urban areas, and I reproduce the definition verbatim below so that readers can judge for themselves.
“2.1.6. Rural and urban areas
The rural and urban areas of the country are taken as adopted in the latest population census for which the required information is available with the Survey Design and Research Division of the NSSO. The lists of census villages as published in the Primary Census Abstracts (PCA) constitute the rural areas, and the lists of cities, towns, cantonments, non-municipal urban areas and notified areas constitute the urban areas.
2.1.6.1 Urban area
The urban area of the country was defined in the 1971 census as follows:
(a) All places with a municipality, corporation or cantonment, and places notified as town areas
(b) All other places that meet the following criteria:
(i) Minimum population of 5,000,
(ii) At least 75 percent of the working population is non-agricultural, and
(iii) A population density of at least 1,000 per square mile (390 per square kilometre).
However, there are urban areas that do not uniformly possess all of the above characteristics. Certain areas have been treated as urban on the basis of their possession of distinct urban characteristics, their overall importance and their contribution to the urban economy of the region.”
The report clearly shows that the NSS goes beyond statutory towns (part a) to include census towns (part b) in its definition of urban areas. At no point does the NSS document say that its definition of urban areas is limited to statutory towns. Furthermore, the NSS goes a step further and treats as urban even those areas that contribute significantly to the urban economy of the region, which goes beyond the Census definition of urban.
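For readers who prefer to see the rule spelled out, the two-part definition quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Place` record, its field names and the example values are hypothetical, and the discretionary clause (areas treated as urban for their "distinct urban characteristics") is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class Place:
    """Hypothetical record for one settlement; field names are illustrative."""
    name: str
    has_statutory_body: bool       # part (a): municipality, cantonment, notified area, etc.
    population: int
    non_agri_worker_share: float   # share of working population outside agriculture, 0..1
    density_per_sq_mile: float


def meets_part_b(place: Place) -> bool:
    """Part (b): all three quoted thresholds must hold together."""
    return (
        place.population >= 5000
        and place.non_agri_worker_share >= 0.75
        and place.density_per_sq_mile >= 1000
    )


def is_urban(place: Place) -> bool:
    """Urban if the place qualifies under part (a) OR under part (b)."""
    return place.has_statutory_body or meets_part_b(place)


# Made-up examples: a town with no statutory body that passes part (b),
# and a large village that fails the non-agricultural threshold.
census_town = Place("Example A", False, 12000, 0.82, 1500.0)
large_village = Place("Example B", False, 9000, 0.40, 1200.0)
```

Under this reading, `is_urban(census_town)` is true even though the place has no statutory body, which is exactly the category at issue in the disagreement.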
Based on the evidence presented, it is clearly wrong for Sen to say, “…although all surveys use the census as the sampling frame, census towns are treated as part of the rural sector and are included in the rural sample.” My understanding of the NSS definition of rural and urban comes from the publicly available document, and I am not in a position to comment on documents that Sen might have been privy to in his former privileged position but which are not in the public domain.
Furthermore, a careful reading of the note on sample design and estimation procedure of the Periodic Labour Force Survey (PLFS), another major national survey, shows that it states: “when details of the next census are available, the new frame will only be used when the urban frame survey is frozen for all towns. The newly declared census and statutory towns are available for sampling frame setting…” This indicates once again that the urban sampling frame includes census towns and is not limited to statutory towns. All of this information is publicly available on the MOSPI website, so it is somewhat surprising that the former chief statistician so confidently asserts otherwise.
Now to his second point of criticism, regarding non-response. Although Sen agrees that estimates will indeed be biased downward, and more so in urban areas than in rural ones, he tries to dismiss this as a general problem and to play it down by comparing the non-response rate in the United States (30 percent) with India’s (8 percent). The devil, as always, hides in the details. One has to go beyond the average non-response rate and study its distribution. For example, while the overall non-response rate for NFHS-5 was only 5 percent (which is remarkable), it was 36 percent and 16 percent for men in Chandigarh and Delhi, respectively. Non-response is neither uniform nor random across regions. According to our analysis, response rates are strongly and negatively associated with wealth.
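Why a low average non-response rate can still distort estimates is easy to see with a little arithmetic. The sketch below uses entirely hypothetical numbers (two wealth groups with made-up population shares, response rates and average values); it is not drawn from NFHS, NSS or PLFS data and only illustrates the mechanism.

```python
def true_mean(groups):
    """Population mean: each group's value weighted by its population share."""
    return sum(share * value for share, _rate, value in groups)


def respondent_mean(groups):
    """Mean among those who actually respond, with no non-response adjustment."""
    weights = [share * rate for share, rate, _value in groups]
    values = [value for _share, _rate, value in groups]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def overall_nonresponse(groups):
    """Average non-response rate across the whole population."""
    return 1.0 - sum(share * rate for share, rate, _value in groups)


# (population share, response rate, average value of the indicator)
# All figures are hypothetical and chosen only for illustration.
groups = [
    (0.7, 0.98, 100.0),  # less wealthy households, almost all respond
    (0.3, 0.70, 300.0),  # wealthier households, far more refusals
]
```

With these made-up inputs the overall non-response rate is only about 10 percent, yet the unadjusted respondent mean comes out at roughly 147 against a true mean of 160, because the households that refuse are disproportionately the wealthier ones.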
Data quality is a serious concern that needs dedicated and continuous effort to improve. Our statistical systems will make progress only if we genuinely understand the problem at hand and address it with humility and transparency. Data systems are not designed to make governments look good or bad. They aim to provide an objective picture of the reality on the ground across the country to anyone interested: policymakers, scientists and citizens alike. The sampling methodology in our surveys requires an urgent upgrade to keep pace with the needs of India’s dynamic economy. “Statisticians are not stupid,” but they need to be open to improvement and innovation; otherwise they could be very wrong.
The writer is a member of the Economic Advisory Council to the Prime Minister of India