Emerging Uses of Big Data in Immigration Research
This project seeks to understand the use, challenges, and opportunities of big data sets in immigration research. Big data are large sets of data whose size is beyond the ability of typical software tools to capture, store, manage, and analyze. The data sets come from government surveys, from geolocation sources such as Facebook tags, and from digital transactions such as financial services. At the outset of this project, it was not clear whether or how researchers were using big data to examine the lives of immigrants more deeply and comprehensively. Such findings can inform immigration settlement services, business products and services offered to immigrants, and government policies and programs.
A scoping review method isolated key themes embedded within a large body of immigration literature. The established six-stage framework by Levac, Colquhoun, and O’Brien (2010) enabled a rigorous approach that identified 453 papers; after high levels of inter-rater reliability were confirmed, 251 immigration studies (1994-2016) were used in this knowledge synthesis project. About 60% were journal articles and another 20% were reports, with the balance drawn from grey literature and theses. The papers examined five themes but were dominated by labour market (57%) and social integration (25%) papers. The studies used 15 databases: 60% drew on Statistics Canada data, 65% used one data set, and 15% used more than three.
Authors reported many challenges, which were grouped under five topics: sampling issues, low response rates, invalidated variables, access, and confidentiality. The authors also identified opportunities: the need for diverse knowledge of data manipulation, given the complexity and growth of some data sets; the design of experiments on an unprecedented scale; skills development so that researchers can use the data sets; the linking of databases to examine different aspects of immigrants’ lives; and the need to advance hardware and software along with the tools and techniques of data mining.
Big data have great potential in immigration research. As mentioned earlier, geocoded data can help aid organizations identify not only where people in need might be located but also what they might need. The Syrian Humanitarian Tracker has been used by thousands of internally displaced people and asylum seekers abroad to identify safe travel routes and to avoid human traffickers. This kind of data has been used in North America to provide real-time alerts with information about missing children (e.g., Amber Alerts in Canada). Monitoring social media data can help non-governmental aid agencies better determine how many people might be waiting to access English or French language courses.
Canada has recently provided extensive and almost on-demand data on Syrian refugee arrivals. Through the online database of Immigration, Refugees and Citizenship Canada, the public can now view maps on the department's website that provide geographically linked information on the number and location of resettled Syrian refugees in Canada. The Government of Canada’s “Open Data” initiative provides some data on citizenship uptake, international students, temporary foreign work permit holders, and other immigration-related issues. These are static tables, but they give users information that was not previously available. Statistics Canada also provides tabular information based on results from the National Census that allows site visitors to identify trends in immigrant and labour market indicators, ethnic identity, the second generation, and language use among newcomers.
Statistics Canada, together with Immigration, Refugees and Citizenship Canada, has been working to provide researchers with even more data opportunities. This is in addition to the wealth of data Statistics Canada provides in Public Use Microdata Files, which novice and intermediate statisticians can use to produce their own statistical estimates and equations (e.g., Ethnic Diversity Survey, Longitudinal Survey of Immigrants to Canada). More advanced data users affiliated with universities can access the master data files of over three dozen current surveys. This allows users to look at small-scale trends and patterns in data that cannot be released to the general public. Although it is possible for independent researchers to gain access to administrative-level data, such as the master data file for the Longitudinal Immigration Database (IMDB), which is housed in Ottawa, these opportunities are given to only a few select researchers. There have been some recent and highly innovative uses of the data, such as the GIS Mapping Project (Garcea et al., 2016), which uses data from the Immigrant Landing File and the IMDB to geocode landings and other records of immigrants and refugees to Canada’s western and northern regions.