The 1940 data set includes more demographic detail than some other data sets: each record gives the person’s name, age, race, address, marital status, education level, and similar attributes. The data set includes numeric, text, and geographic information. The numeric fields include age, estimated birth year, income, and the value of the person’s home. The text fields include whether the person rented or owned their home, their relationship to other people in the home, gender, race, marital status, whether they attended high school or college, the highest grade they completed, their employment status, the birthplace of their parents, and their native language. Compared with the numeric and text information, the geographic information is limited; it includes the individual’s birthplace, residence, and street name. The values in each numeric column vary. For example, ages range from 1 to 87, home values range from 20 to 10,000, and income ranges from none to as much as 7,500 dollars. The geographic range of the data shows that many people lived in the same areas; although some of these people belong to the same family, a significant number appear to be unrelated to one another but live nearby. Most of the people lived in the downtown area of Albany, on streets including Hamilton Avenue, Stanwix Street, Delaware Avenue, Barrow Street, Second Avenue, and other neighboring places. Each row in the data set presents a variety of information under headings such as race, address, age, and employment status, and each of these headings describes a person (the individual’s age, race, marital status, etc.), a place (address), or a thing.
Some information in the data set does not need to be searched for because it is given directly; other information is not given, and conclusions must be drawn from what is. One comparison that can be made is the relationship between gender and employment: the initial assumption that most of the women in the data set did not work proves to be true. Although most did not work, some did; some of the women were servants, which counts as employment, and others had received a high level of education that gave them the knowledge and skills for an occupation. Another comparison is whether the level of education a person received plays a role in their occupation. Based on the data, those with less than a high school education are mostly unemployed, those with a high school education hold jobs such as traveling salesman and electrician, and the few with a college education work as lawyers or librarians. A comparison can also be made between home ownership and the gender of the owner. Based on the data, most, if not all, of the people who owned homes were male and were the head of their household; their wives were often unemployed, and so were most of their daughters. One relationship cannot be determined from the information provided alone but is worth looking into: whether the area a person lived in affected their ability to own a home. A correlation between area and home ownership might help explain why many people, even those unrelated to one another, lived in similar areas.
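
Comparisons like the one between gender and employment can be checked by counting how many records fall into each pair of categories. The sketch below is only an illustration of that counting step: the records are made-up toy rows, not values from the 1940 data set, and the field names `gender` and `employed` as well as the helper names `cross_tabulate` and `share` are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical toy records, NOT taken from the actual 1940 data set.
# The field names "gender" and "employed" are assumed for illustration.
records = [
    {"gender": "F", "employed": False},
    {"gender": "F", "employed": True},
    {"gender": "F", "employed": False},
    {"gender": "M", "employed": True},
    {"gender": "M", "employed": True},
    {"gender": "M", "employed": False},
]


def cross_tabulate(rows, row_key, col_key):
    """Count how many rows fall into each (row_key, col_key) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def share(counts, row_value, col_value):
    """Fraction of rows with row_value that also have col_value."""
    total = sum(n for (rv, _), n in counts.items() if rv == row_value)
    return counts[(row_value, col_value)] / total if total else 0.0


counts = cross_tabulate(records, "gender", "employed")
for gender in ("F", "M"):
    print(gender, "share employed:", round(share(counts, gender, True), 2))
```

The same two helpers could be pointed at other pairs of columns, such as education level and occupation or home ownership and gender, provided the real column names are substituted for the assumed ones.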