Australian Institute of Criminology

Methodology

The central purpose of the AIC’s research was to investigate, using the best means available, the extent to which international students are the victims of crime in Australia. This, like all research projects of its kind, is complicated by a number of factors, not least of which are the well-known difficulties encountered when attempting to accurately and reliably measure victimisation using police records. For international students (and other minority populations), these problems are more profound because the data needed to identify the relevant populations (students of different nationalities) do not exist. In Australia, no state or territory police service currently mandates the collection of data pertaining to a victim’s citizenship, employment status, education or ethnicity; yet for the reliable identification of international students, all four data items are required.

In reality, criminal justice databases in Australia are specifically designed to meet operational needs. In the absence of mandatory data collection, police officers investigating a crime are likely to record only information offered to them by the victim at the time of interview, or information otherwise identified during an investigation that is considered relevant to the apprehension of an offender and the prosecution of a case. Information that the investigating officer deems not germane to a case is either not officially recorded, or not specifically sought on the grounds that collecting it may be perceived as facilitating discrimination within the criminal justice system.

Therefore, in order to provide an estimate of the extent to which international students have been reported as the victims of crime, an alternative method was needed to identify the relevant population of international students, one that did not rely on non-mandatory data items captured by the police. To that end, a methodology was devised based on the cross-matching of state/territory police case records with data on international students held by DIAC. Data-sharing provisions were negotiated with participating law enforcement agencies and DIAC after two approvals were obtained: a Temporary Public Interest Determination (TPID) granted by the Privacy Commissioner, which enabled the release of Commonwealth data for research purposes, and ethics approval from the AIC’s own Human Research Ethics Committee.

A secondary purpose of the research was to identify whether international students experienced victimisation at rates higher than the general Australian population. To do so, student victimisation rates were compared with age-adjusted victimisation rates for the relevant state population averages. These population averages, where available, have been identified using data from the ABS Recorded Crime Victims report (ABS 2010a). Procedures governing the data collection and analysis are detailed below.

Sample

Given the limits of existing police databases to identify international students as victims of crime, an alternative methodology was devised using records from DIAC. DIAC is the Australian Government agency principally responsible for the administration of the international student visa program and maintains a database of applications and arrivals in Australia. DIAC data was therefore the best source for accurately identifying the ‘sample’ of persons for whom crime victimisation was to be measured in this study.

For the present study, the international student sample comprised all successful international student visa applicants who arrived in Australia between 1 January 2004 and 18 May 2010 (the data extraction cut-off point) from one of five source countries: India, People’s Republic of China, Republic of Korea (South Korea), Malaysia and the United States. These five countries were selected because they contributed the largest number of students to the Australian international student sector between 2004 and 2010.

A total of 496,902 individuals were identified in the DIAC database. Of these, 445,615 (90%) were primary applicants (ie seeking to study at an Australian institution), while 51,280 (10%) were secondary applicants. For the purposes of this study, a secondary visa applicant was any person accompanying a primary student visa holder, but who was not necessarily going to study at an Australian institution themselves. Secondary applicants most often include the spouse or child(ren) of a primary applicant.

Of the 445,615 primary applicants identified, 177,847 (40%) were from the People’s Republic of China, 128,251 (29%) were from India, 55,989 (13%) were from the Republic of Korea, 42,784 (10%) were from the United States and 40,744 (9%) were from Malaysia (see Figure 2).

Figure 2: International student arrivals by nationality in all states, 2004–10 (%)

Note: Percentages may not sum to 100 due to rounding

Source: AIC International Student Victims of Crime 2010 [computer file]

Jurisdictional analysis

Australian state and territory jurisdictional analysis of student visa numbers was made possible using the CRICOS identification number. When applying to study in Australia, each student must nominate the course or institution at which they intend to study. These institutions or courses are allocated a jurisdictionally unique identification number (CRICOS) which can later be used to identify each student’s jurisdiction of arrival.

CRICOS numbers were not recorded for 45,631 students in the database (9%), although this was not evenly distributed across each of the five source countries. The Republic of Korea had a particularly high level of missing data, with 44 percent of primary applicants having no recorded CRICOS number. For the other source countries, missing data was not substantial—less than one percent for primary applicants from China and less than five percent for primary applicants from India, Malaysia and the United States.

Since the analysis in this study is conducted at a jurisdictional level, and there was no viable solution to assigning those students with a missing CRICOS number to particular jurisdictions, these students have been excluded from the analysis. The results presented throughout the report are, therefore, relevant only to the population of students known to have been studying in each jurisdiction. Further, it is important to note that the CRICOS identification number for students who move interstate to an alternative institution or course after arriving in Australia is not recorded or updated in the DIAC database. Therefore, CRICOS numbers pertain only to the course or institution of intended study upon arrival and population estimates in this study do not account for interstate transfers.

Of the 445,615 primary applicants with a known CRICOS number, 35 percent were listed as studying in New South Wales, 34 percent in Victoria, 15 percent in Queensland, seven percent in both South Australia and Western Australia, two percent in the Australian Capital Territory, one percent in Tasmania and less than one percent in the Northern Territory. There were some notable jurisdictional differences for each of the respective source countries. For example, a larger proportion of Indian (49%) and Malaysian (38%) students were studying in Victoria than in any other jurisdiction, while New South Wales accounted for the largest share of students from the People’s Republic of China (43%), Republic of Korea (47%) and the United States (42%). The small number of international students studying in the Northern Territory from 2004 to 2010 precluded any further analysis of the NT data.

Data matching

Following the identification of the DIAC student sample, two databases were constructed. The first, containing only the names and dates of birth for all students, was sent to each state and territory police agency to facilitate a name and date of birth search of victimisation records in the relevant reference period. The second, containing only de-identified information about each student (gender, age, CRICOS number etc), was sent to the AIC. Both databases contained a unique student identification number, which later facilitated a data match between the victimisation records returned by each police agency and the de-identified DIAC data held by the AIC. This limited access to identifying information to the minimum necessary and ensured that individual students could not be identified in the resultant sample used for analysis by the AIC.
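The two-database design described above can be illustrated with a minimal sketch. The field names, the sequential identifier and the helper functions below are hypothetical, chosen only for illustration; the actual DIAC schema and identifier scheme are not described in this report.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class StudentRecord:
    # Hypothetical fields; the real DIAC extract is not documented here.
    name: str
    date_of_birth: date
    gender: str
    age: int
    cricos: Optional[str]


def split_for_release(students):
    """Build the identified (police) file and the de-identified (research) file.

    The two files share only a surrogate student_id. A sequential integer is
    used here as an assumption; any unique key would serve the same purpose.
    """
    police_file = []    # names and dates of birth only
    research_file = []  # de-identified attributes only
    for student_id, s in enumerate(students, start=1):
        police_file.append(
            {"student_id": student_id, "name": s.name, "date_of_birth": s.date_of_birth}
        )
        research_file.append(
            {"student_id": student_id, "gender": s.gender, "age": s.age, "cricos": s.cricos}
        )
    return police_file, research_file


def link_returns(police_returns, research_file):
    """Join de-identified police returns to the research file on student_id."""
    by_id = {row["student_id"]: row for row in research_file}
    return [
        {**by_id[r["student_id"]], **r}
        for r in police_returns
        if r["student_id"] in by_id
    ]
```

Under this arrangement the analyst only ever handles rows keyed by `student_id`, so a linked record carries offence details and demographics but no name or date of birth.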

Underpinning the victimisation record search was a complex set of search parameters developed by the AIC. These parameters were developed in consultation with each of the states and territories in an effort to minimise any potential inter-jurisdictional bias in the data collection process. The data matching was conducted separately by each agency on the full complement of student data provided by DIAC, including secondary applicants and those whose CRICOS number was not known (n=496,902).

The final matching process comprised two waves. In the first wave, records were extracted for all episodes of victimisation in which the victim’s full name and date of birth were an exact match to a student in the DIAC database. In order to allow for misspelling of student names, a second wave of matching was undertaken using Soundex, a phonetic algorithm that converts each name into an alphanumeric code based on its phonetic structure. By using Soundex, two names spelled differently but with the same phonetic structure can be considered equal for the purposes of matching. In this second wave, episodes of victimisation were extracted for those cases where there was an exact match on a person’s full Soundex name and date of birth.

Both waves were deemed important in this study. The first wave, though the more stringent and less prone to type one error (false positives), was also likely to miss a proportion of cases for which there was a typographical or other data entry error (resulting in false negatives). Given the diverse range of cultural and ethnic backgrounds from which the study sample was drawn, it was reasonable to suspect that a sizeable number of records would be missed during a search in which only an exact matching procedure was employed. The use of Soundex in the second wave, therefore, provided a quasi-control for data entry errors since names with the same phonetic structure could be matched regardless of small differences in spelling. For example, using Soundex, the following three names, ‘Jason Payne’, ‘Jaysen Pain’ and ‘Jaesun Paine’, would receive the same code: J250 P500.

Finally, the search process was cumulative such that only new records (those not previously identified during wave one) were extracted during the second wave. Each record was given a final code (1 or 2) indicating the wave in which it was extracted.
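The two-wave procedure can be sketched as follows. This is an illustrative implementation of standard American Soundex and a cumulative exact-then-phonetic match; it is not the search code used by the police agencies. The tuple layouts, the case-insensitive exact comparison and the function names are assumptions, and the sketch ignores the possibility of two students sharing the same key.

```python
# Standard Soundex digit groups.
_CODES = {}
for _digit, _letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the four-character Soundex code for a single word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            encoded += code
        prev = code
    return (encoded + "000")[:4]


def soundex_name(full_name):
    """Encode each part of a full name, e.g. 'Jason Payne' -> 'J250 P500'."""
    return " ".join(soundex(part) for part in full_name.split())


def match_two_waves(students, victim_records):
    """Match victim records to students; return {record_id: (student_id, wave)}.

    students:       iterable of (student_id, name, date_of_birth)
    victim_records: iterable of (record_id, name, date_of_birth)
    """
    students = list(students)
    victim_records = list(victim_records)
    exact = {(name.lower(), dob): sid for sid, name, dob in students}
    phonetic = {(soundex_name(name), dob): sid for sid, name, dob in students}

    matches = {}
    # Wave 1: exact full name and date of birth.
    for rid, name, dob in victim_records:
        key = (name.lower(), dob)
        if key in exact:
            matches[rid] = (exact[key], 1)
    # Wave 2: Soundex name and date of birth, only for records not yet matched.
    for rid, name, dob in victim_records:
        if rid in matches:
            continue
        key = (soundex_name(name), dob)
        if key in phonetic:
            matches[rid] = (phonetic[key], 2)
    return matches
```

With a student recorded as ‘Jason Payne’, a victim record for ‘Jason Payne’ with the same date of birth is coded as wave 1, while a record for ‘Jaysen Pain’ with the same date of birth is picked up only in wave 2.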

Each of the eight Australian state and territory police agencies created a separate database of victimisation records. Each record was assigned the relevant unique student identification number and stripped of all identifying information before the database was returned to the AIC. Each database was then cleaned and coded to ensure inter-jurisdictional comparability, merged with the others and later matched, using the unique student identification number, to the de-identified DIAC data provided to the AIC.

The initial database contained 23,732 victimisation records for all possible offence types. Of these records, a proportion were later identified as ineligible for inclusion in the final analysis. This was mostly a result of duplicate records or records that were incorrectly selected during the application of Soundex in the matching process. Further, some records were for offence types (disorderly conduct, breaches, traffic and driving offences) which could not be reasonably counted as incidents of victimisation based solely on the offence description. These incidents were excluded from the analysis. Finally, a number of offence types (eg sex and fraud offences) were excluded because sample sizes and offence numbers were insufficient to conduct reliable comparative analysis at a jurisdictional level. Of the remaining data, three key offence types (assault, robbery and other theft) were chosen for comparative analysis. In all, the final database contained 13,204 unique victims (3% of all students) who reported a total of 14,855 records of assault (n=3,201), robbery (n=3,206) and other theft (n=8,440).

Research approval

The data matching procedures required for this project involved the transfer of identified data (names and dates of birth) between DIAC and each of the eight state and territory police agencies as well as de-identified data between DIAC, the police and the AIC. As a result, the project was subject to a number of approval processes, most notably written approval to proceed provided by the Police Commissioner in each jurisdiction, approval by the AIC’s Human Research Ethics Committee (protocol PO156) and a TPID from the Office of the Privacy Commissioner (Australian Government).

Temporary Public Interest Determination

DIAC protocol dictates that the transfer of potentially identifying information between agencies, without consent, and for purposes not previously indicated at the time of collection should be subject to consideration by the Office of the Privacy Commissioner. In this study, the information needed to facilitate a victimisation record search was not originally collected for the purposes of research, nor had the students at the time of completing their application been asked to consent to the transfer of that information for research purposes. Moreover, consent could not be obtained specifically for the purposes of this study, as consistent or reliable contact with the students included in the sample was not possible because:

  • residential details are not routinely collected by DIAC at the time of application and in cases where they are recorded, the addresses mostly relate to the students’ normal place of residence outside Australia;
  • given the short-term nature of the student visa program, many students will have since departed Australia at the completion of their study. No address is collected by DIAC at the time of departure from Australia; and
  • even where DIAC holds an Australian residential address, post-arrival movements by a student are not collected or updated in the DIAC database.

As a result, it was necessary for DIAC to submit an application to the Office of the Privacy Commissioner for a TPID. A TPID is the vehicle through which Commonwealth agencies can apply for a time-limited relaxation of key privacy principles, in so far as the use of personal or identifying information is considered in the public interest. In this case, a TPID was made on 5 May 2010 and provided short-term approval for the transfer and use of identified student data, without consent, subject to the following conditions:

  • the provision of personal information held by the department relating to student visa holders and former student visa holders would be a once-only arrangement solely for the purposes of assisting the AIC research;
  • the data would be delivered as an electronic file by bonded courier to the relevant police jurisdiction;
  • the relevant police jurisdiction would provide written notification to DIAC upon receipt of the data;
  • the relevant police jurisdiction would use the personal information provided by DIAC solely for the purpose of undertaking a one-off data matching exercise against the details of victims of crime in incidents recorded by police between 1 July 2004 and the date of the data match;
  • the relevant police jurisdiction would ensure the security and privacy of the personal information provided in relation to this arrangement was protected by password at all times;
  • the disclosure of personal information by DIAC was subject to Commonwealth legislation and guidelines that govern the protection of information, secrecy obligations and general conduct. These include, but are not limited to:
    • the Privacy Act 1988 (the Privacy Act);
    • Privacy Act 1988 Part VI—Temporary Public Interest Determination No. 2010–1 3;
    • the Crimes Act 1914;
    • the Public Service Act 1999;
    • the APS Values and Code of Conduct; and
    • the Australian Government Protective Security Manual.
  • the relevant police jurisdiction acknowledges the Privacy Act applies in respect of the provision of data under this arrangement;
  • the relevant police jurisdiction would not intend to do any act or engage in any practice that would breach the privacy and secrecy provision under which the relevant police jurisdiction operated;
  • the relevant police jurisdiction would ensure that there was no merging, matching, exchange or any other forms of interaction between personal information obtained during the course of providing services under this arrangement and other data sets, or other information held by the police jurisdiction;
  • the relevant police jurisdiction would ensure that any employee of the service required to deal with personal information for the purposes of this arrangement would be aware of the obligations set out in these conditions and would undertake to comply with these obligations;
  • the electronic file was securely stored inside a B class security cabinet (or equivalent) while in possession and control of the relevant police jurisdiction;
  • the relevant police jurisdiction would provide the matched data to the AIC de-identified, in the form of the Person ID with detail of the police-recorded incident;
  • upon completion of the data-matching exercise, as advised by the AIC, the relevant police jurisdiction would destroy the hard copy and delete all digital copies of DIAC’s data file in their possession that relate to this arrangement, unless advised otherwise;
  • the relevant police jurisdiction would provide written notification to DIAC upon deletion of all such files; and
  • the relevant police jurisdiction agrees to immediately notify DIAC if the service becomes aware of a breach or possible breach of any of the obligations contained in these conditions.

In addition to these conditions, DIAC also facilitated an ‘opt-out’ process whereby students could elect to be removed from the study. Notification of the opt-out process was placed on the DIAC website, as well as on the websites of the Australian Embassy or High Commission in India, People’s Republic of China, Republic of Korea, Malaysia and the United States. Students electing to opt out were provided with an opportunity to complete a brief opt-out survey, designed by the AIC to collect demographic and self-reported victimisation data. The opt-out period lasted for 21 days, ending on 30 May 2010. In all, 11 students opted out of the study and were excluded from the data transfers and analysis.

Coding and counting rules

Episodes of victimisation were classified by each of the police agencies according to the ABS Australian Standard Offence Classification (ASOC; ABS 2008b). Most other data (ie incident location, time of day, day of week) were provided to the AIC in the format originally recorded by each police agency. Once the data were received, the AIC developed a coding framework that could be applied consistently to all jurisdictional data extractions, with some exceptions as outlined in the respective jurisdictional data summaries.

Further, to maximise data consistency between jurisdictions and to ensure comparability between these international student data and those recorded in the ABS Recorded Crime Victims database, a number of standardised counting rules were applied:

  • The ABS counting rules stipulate that multiple incidents of victimisation occurring for the same victim on the same day and being of the same offence type are counted as a single incident.
  • Offences of similar type are identified as those coded within the same division of ASOC (ABS 2008b). For example, two assaults recorded for the same student on the same day would be classified under these counting rules as one incident of victimisation. Alternatively, one assault and one robbery for the same victim on the same day would be classified as two incidents.
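The counting rules above amount to de-duplicating records on victim, date and ASOC division. The sketch below assumes each record is a tuple of student identifier, incident date and a four-digit ASOC code held as a string, with the first two digits giving the division; the function name and record layout are illustrative only.

```python
def apply_counting_rules(records):
    """Collapse records to one incident per victim, day and ASOC division.

    records: iterable of (student_id, incident_date, asoc_code), where
    asoc_code is a four-digit string such as "0211" (an assumed layout).
    Returns the list of unique (student_id, incident_date, division) keys
    in first-seen order.
    """
    seen = set()
    incidents = []
    for student_id, incident_date, asoc_code in records:
        division = asoc_code[:2]  # e.g. "02" for 0211 and 0212
        key = (student_id, incident_date, division)
        if key not in seen:
            seen.add(key)
            incidents.append(key)
    return incidents


records = [
    (1, "2008-03-01", "0211"),  # assault
    (1, "2008-03-01", "0212"),  # second assault, same victim and day
    (1, "2008-03-01", "0611"),  # robbery, same victim and day
]
print(len(apply_counting_rules(records)))  # 2: one assault incident, one robbery
```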

Counting rules such as these are an important tool in criminological research for controlling inconsistencies within and between administrative databases, including duplications that result from differing operational practices (see Payne 2007). They result from the recognition that administrative databases are prone to duplication, a particular problem for victimisation data where an incident involves more than one offender.

A negative consequence of using counting rules to assist with comparability is that the type of offences able to be compared is restricted. Three offence types are examined in this study:

  • Assault: the direct infliction of force, injury or violence upon a person, including attempts or threats. It excludes sexual assault. In this study, episodes of assault were identified for any offence recorded within subdivision 02 (0211 and 0212) of ASOC (ABS 2008b).
  • Robbery: the unlawful taking of property, without consent, accompanied by force or threat of force. In this study, episodes of robbery were identified for any offence recorded within subdivision 06 (0611 and 0612) of ASOC (ABS 2008b).
  • Other theft: the taking of another person’s property with the intention of permanently depriving the owner of the property illegally and without permission, but without force, threat of force, use of coercive measures, deceit or having gained unlawful entry to any structure even if the intent was to commit theft. This offence includes such crimes as pick pocketing, bag snatching, stealing (including shoplifting), theft from a motor vehicle, theft of motor vehicle parts/accessories or petrol, theft of stock/domestic animals and theft of non-motorised vehicles/boats/aircraft/bicycles. It does not include the theft of a motor vehicle itself. In this study, episodes of other theft were identified for any offence recorded within subdivision 08 (0813 to 0841) of ASOC (ABS 2008b).

In order to ensure comparability with ABS-derived average victimisation rates, each jurisdiction provided victimisation records consistent with data provided to ABS, with one exception:

One nuance of the robbery data lies in the definition of a victim. The ABS determines a victim of robbery on the basis of property ownership; that is, a person or organisation is considered a victim of robbery if they incur a loss of property. If the robbery only involves property belonging to an organisation, then one victim (ie the organisation) is counted regardless of the number of employees from which the property is taken (ABS 2010a: 111).

Given the focus in this study was on international students as victims, where an international student was present at the time of a robbery they were counted as a victim. An examination by the AIC of duplicate incidents of robbery that potentially involved secondary victims revealed that this was only an issue in a small number of cases and the impact of this on the rate of robbery of international students was likely to be negligible.

Population estimation

Much of the analysis in this report is undertaken using annual rates of victimisation per 1,000 of relevant population. That is, the number of incidents of crime victimisation each year is divided by the known population for that year and multiplied by a constant (1,000). This standardisation ensures that the prevalence of victimisation is comparable between samples and over time, net of the effect of population differences.

The calculation of a crude victimisation rate is relatively straightforward using ABS recorded crime victim data, since the only available population estimates are from the Australian Census. In this study, the recorded number of crime victimisation incidents and the Australian Census annual population estimates were used, with separate annual estimates identified for each jurisdiction, by age and gender (ABS 2006).

For the international student victim database, however, the calculation of victimisation rates was somewhat complicated by the highly variable and short-term nature of the international student education sector. Across the year, the population of students studying within a jurisdiction fluctuated with the arrival of new students and the departure (or movement to an alternative visa type) of students who had completed their studies. In the absence of a single agreed estimate, a number of possible options were canvassed in this study.

First, an estimate of the total number of students studying in each jurisdiction at any time throughout the year was calculated, irrespective of how long each student was actually living/studying in that jurisdiction. Though a true count of the cumulative number of unique students living in a jurisdiction throughout the year, this estimate would produce the largest of all possible options since it fails to account for the reality that not all students spend a full 365 days living and studying within their jurisdiction. For example, some may have finished their studies in January (after only 1 month in the relevant year) and returned home shortly after; others may have started their studies in October and arrived in Australia shortly before. These students (those who spent less than a full 365 days living in Australia for the relevant year) were not ‘at risk’ of victimisation for the entire year in which they were counted, yet they would be treated as such under this simple annual population count. Further, this estimate includes no procedure for standardisation, meaning that the estimates may be biased against countries with a greater tendency towards short-term study arrangements. Overall, this method was rejected because counting short-term students as whole units within the relevant population would inflate the denominator and consequently deflate the rates of victimisation.

A second method, consistent with ABS population estimates, was developed using a census date. In this method, a single date (8 August) was chosen as the census date and all persons known to be living within each jurisdiction on that date were counted as the ‘at-risk’ population. The underlying assumption was that the population living within a jurisdiction on the census date would represent an average of the fluctuations in the population across the entire year. ‘Census date’ population estimation is a common method used in social and criminological research, including the AIC’s National Police Custody Survey (Taylor & Bareja 2005) and Juveniles in Detention Monitoring Program (Richards & Lyneham 2010).

Despite its simplicity, the census date population estimates were not used in this study. A test of their reliability (ie the extent to which they represented the population average) was performed comparing six randomly selected census dates in each annual period. The results illustrated that no single census date (including 8 August) provided a reasonable estimate of the ‘at-risk’ population average across the year. Further, it was found that the bias associated with any single census date was distributed unevenly between the five source countries and across each of the five years, limiting the reliability of both the time series and between-country comparisons.

It was therefore decided to determine the annual ‘at-risk population’ by a third method, based on the number of days each student was known to be studying within each jurisdiction each year. The sum across all students in the relevant year therefore represented the total number of ‘at-risk’ days. Dividing by 365 yielded a standardised estimate of the number of full-year students. Under this method, students who lived in a jurisdiction for only part of the year contributed to only part of the final population estimate. For example, a student who studied for six months in 2005 and six months in 2006 would contribute approximately 180 ‘at-risk’ days (or about 0.5 full-year persons) in each of the two years. Despite being mathematically intensive, this method provided the best mid-point for a highly variable population. It also standardised the population estimation procedure for each of the five source countries and across each of the five annual periods, improving comparability.
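The ‘at-risk’ days method can be sketched as follows. The function names and the representation of each student’s stay as an inclusive (start, end) pair of dates are assumptions made for illustration; the division by 365 follows the description above and ignores leap years.

```python
from datetime import date


def days_at_risk(start, end, year):
    """Number of days in `year` covered by the inclusive stay [start, end]."""
    first = max(start, date(year, 1, 1))
    last = min(end, date(year, 12, 31))
    return max((last - first).days + 1, 0)


def full_year_students(stays, year):
    """Sum at-risk days across all stays and convert to full-year persons."""
    total_days = sum(days_at_risk(start, end, year) for start, end in stays)
    return total_days / 365


def rate_per_1000(incidents, population):
    """Annual victimisation rate per 1,000 of the at-risk population."""
    return incidents / population * 1000


# A hypothetical student present from 1 July 2005 to 30 June 2006.
stay = (date(2005, 7, 1), date(2006, 6, 30))
print(days_at_risk(*stay, 2005))  # 184
print(days_at_risk(*stay, 2006))  # 181
```

Counting calendar days gives 184 and 181 days for the two half-years rather than the rounded 180 used in the worked example, but in both years the student contributes roughly half a full-year person to the denominator.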

Weighting

It became apparent during the early stages of data analysis that significant gender and age differences existed between the five source countries in each state and territory (see jurisdictional summaries). Further, the general age and gender profile of students differed significantly from the profile of the broader population for which the jurisdictional recorded crime rates were calculated. Without some adjustment, these differences would be likely to have significant implications for the comparability of the resulting estimates.

In New South Wales in 2005, for example, 67 percent of male Indian international students were aged between 20 and 24 years. This compared with 62 percent of male students from the People’s Republic of China, 25 percent of male students from the Republic of Korea, 59 percent of male students from Malaysia and 41 percent of male students from the United States. The combined average of males aged 20–24 years was 51 percent. By contrast, according to ABS population figures, males aged between 20 and 24 years comprised only 16 percent of all males living in Australia aged between 15 and 45 years. It is clear from these data that male international students in New South Wales were disproportionately young compared with the overall Australian population and that Indian male students were the youngest student population of all five source countries.

To correct for this bias, data weights were calculated as proportional to the average age and gender distribution across each of the five source countries. These weights, easily conceptualised as multipliers, help to realign each of the respective populations to an equal age and gender distribution, making between-country comparisons more reliable. In the example of New South Wales above, male students aged between 20 and 24 years comprised an average of 51 percent of all male international students studying in 2005. To readjust the data, those countries with a higher than average proportion of 20–24 year old males (India, the People’s Republic of China and Malaysia) were weighted down within that age category, while those with a lower than average proportion (Republic of Korea and the United States) were weighted up. Weights were proportional to the average such that Indian males aged between 20 and 24 years, for example, were given a final weight of 0.76 (51%/67%=0.76).

Once calculated for each age and gender combination in each jurisdiction, the weights were then used to multiply the crime victimisation counts to derive the final weighted victimisation rates per 1,000.
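The weight calculation for a single age and gender cell can be sketched using the New South Wales 2005 figures quoted above. The function name is hypothetical, and the sketch assumes the ‘average’ is the simple mean of the five country proportions, which reproduces the 51 percent and 0.76 figures in the text.

```python
def age_group_weights(share_by_country):
    """Weights for one age/gender cell: average share divided by country share.

    share_by_country maps each country to the proportion of its students who
    fall in the cell. A country with a zero share would need special handling,
    which this sketch does not attempt.
    """
    average = sum(share_by_country.values()) / len(share_by_country)
    return {country: average / share for country, share in share_by_country.items()}


# Proportion of male students aged 20-24 years, New South Wales, 2005.
shares = {
    "India": 0.67,
    "China": 0.62,
    "Korea": 0.25,
    "Malaysia": 0.59,
    "United States": 0.41,
}
weights = age_group_weights(shares)
print(round(weights["India"], 2))  # 0.76

# Hypothetical count of incidents in this cell, weighted before rates are derived.
weighted_incidents = 120 * weights["India"]
```

Countries with an above-average share in the cell receive a weight below one, and those with a below-average share receive a weight above one.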

Statistical testing and confidence intervals

Conventional statistical testing is commonly used to identify the likelihood that differences between two groups in a sample of the population are true differences not influenced by sample bias, error or study design. Traditionally, the test level used in criminological and social science research is five percent (p<0.05). That is, a difference between two groups is considered statistically significant if a difference at least as large would be expected in fewer than five of every 100 samples drawn from a population in which no true difference exists.

The purpose of statistical testing, therefore, is to measure the extent that differences in a sample truly exist in the larger population from which the sample was originally drawn. However, if the estimates themselves are derived from an examination of the whole population in question, then statistical testing is redundant. This is because differences within and between groups in the whole population are true differences, not subject to sampling or research design error (de Vaus 2002).

In the present study, victimisation data were collected for the entire known student population in each jurisdiction, not a sample of each population. Further, the ABS recorded crime data against which comparisons are made in each jurisdiction are also whole population estimates. As such, differences in the rate of victimisation between the five source countries and the comparative jurisdictional populations are actual differences not likely to have been influenced by the probability of sampling error or design bias. For these reasons, conventional statistical testing is not used in this study.

In lieu of conventional tests, victimisation rates are presented in this report with both upper and lower bound confidence intervals. A confidence interval represents the upper and lower limits of the point estimate, taking into account the possibility that (despite being whole population analysis) some error may have been introduced during the extraction, collation and coding of the relevant data—a common problem in criminological research generally.

The confidence intervals used in this study are calculated using the Poisson distribution, an alternative to the normal distribution that is specifically designed for count variables where events over a fixed time interval are assumed to occur at random, independently across time and at a constant rate. Count variables (ie the number of incidents of victimisation) have a number of special properties that do not meet the standard assumptions of the normal distribution. In particular, count variables and their associated rates are bounded at zero, meaning that they cannot take negative values. In the present study for example, there cannot be -1 incidents of victimisation, nor can there be a rate of -1 incidents per 1,000 of the population. The Poisson distribution is among the more commonly used alternatives that account for these special properties.
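
The zero-bound problem can be seen with a short, hypothetical illustration. The sketch below applies the familiar symmetric normal approximation (count plus or minus 1.96 times the square root of the count) to a small and a large count. The function name and the counts are assumptions chosen for illustration; they are not figures from this study.

```python
import math


def normal_approx_interval(count, z=1.96):
    """Symmetric normal-approximation interval for a count: count +/- z * sqrt(count)."""
    half_width = z * math.sqrt(count)
    return count - half_width, count + half_width


# Hypothetical counts: one small, one large.
for count in (3, 100):
    lower, upper = normal_approx_interval(count)
    print(count, round(lower, 2), round(upper, 2))
```

For the count of 3 the lower limit falls below zero, which is impossible for a count of incidents, while for the count of 100 it stays well above zero. Small counts are therefore where the normal approximation is least suitable.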

The formula used for the calculation of Poisson confidence intervals is:

where Y is the number of events, Yl and Yu are the lower and upper limits of Y, respectively, and χ²(n,a) is the chi-square quantile for upper tail probability a with n degrees of freedom. Statistical proofs for the relationship between the Poisson and chi-square distributions are described elsewhere (Ulm 1990).
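
As a hedged illustration, one commonly stated form of these exact limits is Yl = χ²(2Y, 1−α/2)/2 and Yu = χ²(2(Y+1), α/2)/2, writing χ²(n,a) for the chi-square quantile with n degrees of freedom and upper tail probability a. This is assumed to be the form intended here and is not confirmed from the source. The Python sketch below does not call a chi-square routine. To stay within the standard library it uses a swapped-in but mathematically equivalent method: it searches directly for the Poisson means at which the observed count sits in each α/2 tail. All function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), with each term computed in log space."""
    if k < 0:
        return 0.0
    if lam <= 0:
        return 1.0
    log_lam = math.log(lam)
    total = sum(math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(total, 1.0)


def _bisect(func, target, low, high, increasing, steps=200):
    """Locate x in [low, high] where a monotone func(x) equals target."""
    for _ in range(steps):
        mid = (low + high) / 2
        if (func(mid) < target) == increasing:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def exact_poisson_ci(count, alpha=0.05):
    """Exact two-sided limits (Yl, Yu) for an observed Poisson count."""
    ceiling = count + 10 * math.sqrt(count + 1) + 10  # generous search ceiling
    if count == 0:
        lower = 0.0  # a count of zero has a lower limit of zero
    else:
        # Yl is the mean at which P(X >= count) equals alpha / 2.
        lower = _bisect(lambda lam: 1 - poisson_cdf(count - 1, lam),
                        alpha / 2, 0.0, ceiling, increasing=True)
    # Yu is the mean at which P(X <= count) equals alpha / 2.
    upper = _bisect(lambda lam: poisson_cdf(count, lam),
                    alpha / 2, 0.0, ceiling, increasing=False)
    return lower, upper


def rate_ci_per_1000(count, population, alpha=0.05):
    """Scale the limits on the count to a rate per 1,000 of the population."""
    lower, upper = exact_poisson_ci(count, alpha)
    return 1000 * lower / population, 1000 * upper / population


lower, upper = exact_poisson_ci(10)
print(round(lower, 2), round(upper, 2))  # 4.8 18.39
```

The interval is asymmetric around the count and never drops below zero, which is the property that motivates the Poisson approach for small numbers of incidents.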

In this study, confidence intervals are a useful tool for examining the differences between whole population samples, taking into account the likelihood that some errors may exist in the data collection, extraction and manipulation phases of the research. Where two population estimates are different and where their respective confidence intervals do not overlap, these differences are interpreted as having a high degree of statistical reliability (95%), that is, to be real differences. Alternatively, where two population estimates are different but where their respective confidence intervals overlap, the differences are interpreted as not sufficiently reliable to draw conclusions about the real differences between the populations in question.
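
The decision rule described above can be written as a short check. The sketch below is illustrative only: the function names are hypothetical, the intervals are made-up values rather than results from this study, and treating intervals that merely touch as overlapping is an assumption.

```python
def intervals_overlap(first, second):
    """True when two (lower, upper) intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


def interpret_difference(first, second):
    """Apply the report's rule: only non-overlapping intervals indicate a real difference."""
    if intervals_overlap(first, second):
        return "not sufficiently reliable"
    return "reliable difference"


# Hypothetical rate intervals per 1,000, for illustration only.
print(interpret_difference((2.1, 3.4), (4.0, 5.2)))  # reliable difference
print(interpret_difference((2.1, 4.3), (4.0, 5.2)))  # not sufficiently reliable
```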

Finally, as implied by the formula above, it is important to remember that the width of a confidence interval depends on the size of the population from which the estimate is drawn. Small populations yield wide confidence intervals, since an estimate becomes more reliable as the probable influence of error decreases. This has particular implications for smaller jurisdictions (Australian Capital Territory and Tasmania), as well as the source countries that have relatively small populations. The AIC has been conservative in drawing conclusions as a result of these data limitations.

In addition to confidence intervals, relative standard errors (RSEs) are used to indicate point estimates (rates) that are considered statistically unreliable. An RSE measures the relationship between the rate’s standard error and the rate itself. Where the standard error is equal to 25 percent or more of the estimate (RSE ≥ 0.25), the rate is flagged accordingly.
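
The text does not state how the standard error was obtained. As a sketch under a stated assumption, if the count of incidents is treated as Poisson, its standard error is the square root of the count, so the RSE of the rate reduces to one over the square root of the count (the population cancels out). The function names below are hypothetical.

```python
import math


def poisson_rse(count):
    """RSE of a rate under a Poisson assumption: sqrt(count) / count = 1 / sqrt(count)."""
    if count <= 0:
        raise ValueError("RSE is undefined for a count of zero or less")
    return 1 / math.sqrt(count)


def is_unreliable(count, threshold=0.25):
    """Flag a rate whose standard error is 25 percent or more of the estimate."""
    return poisson_rse(count) >= threshold


for count in (4, 16, 17, 100):
    print(count, round(poisson_rse(count), 3), is_unreliable(count))
```

Under this assumption, any rate based on 16 or fewer incidents would be flagged, which is consistent with the caution urged for smaller jurisdictions and source countries.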

Limitations

This research into the victimisation experience of international students is the first of its kind in Australia to use an Australian Government agency database (DIAC) in a complex data matching procedure to conduct population-level analysis of crime victimisation rates across state/territory jurisdictions. There are a number of limitations to this study that must be taken into consideration when interpreting the results.

Recorded crime victimisation

Crime victimisation is notoriously difficult to measure for a number of reasons, not the least of which is that the two most common methods (administrative data collections and self-report surveys) are subject to a number of limitations which affect their reliability and generalisability.

Administrative data (typically police data) pertain only to those incidents of crime that have been detected by, or willingly reported to, the police. These reports may come from bystanders or police personnel who witness a crime take place, but more often than not come from the victims of the crimes themselves. In any case, it is widely recognised that administrative data collections underestimate the true nature and extent of victimisation because a sizable proportion of all victimisation that occurs in Australia is never reported to the police (see end note iii).

Exactly how much crime goes unreported is subject to considerable debate; however, the most recent results from the ABS Crime Victimisation Survey show that only 39 percent of those who had been a victim of assault in the last 12 months reported the most recent incident to the police (ABS 2010c). Further, only 29 percent of self-reported robbery victims and 23 percent of those who had been the victim of physical threats reported the offences to police.

Further, not only does the probability of reporting vary by offence type, but it is also likely to be influenced by the nature of the offence and the characteristics of the victim. Domestic or intimate partner violence is one notable example of where the nature of an assault (ie the involvement of a closely related perpetrator) is likely to influence a victim’s willingness to report to the police. Other studies have found that victims are less likely to report to the police if they had been drinking alcohol at the time of the incident or if they themselves were involved in some illegal activity (eg Sweeney & Payne 2011).

Of particular importance to this study are the various cultural barriers that may affect differential crime reporting rates among international students and how these differences may affect the reliability of the comparisons used in this report. Language is an obvious example, where some international students may be less inclined to report to the police because of concerns they will not be able to clearly articulate what had happened. Others have suggested that students from certain countries may be more inclined to report a crime to their respective high commission or embassy, instead of the police, because of local customs or practices in their home country (C Nyland personal communication 2010).

In reality, many of the possible factors influencing reporting rates have not yet been, nor can they be, tested in this study. As is the case with all such recorded crime victimisation studies, the current study is only able to analyse those crimes reported to the police. Thus, the results must be interpreted with the knowledge that under-reporting exists and that the rate of under-reporting may vary between each of the five source countries and between international students generally and the broader jurisdictional population as a whole.

Data matching

Underpinning the reliability of the results presented in this study is the quality of the data matching procedures used to extract victimisation records. Although significant effort was made to ensure that the extraction procedures employed by each jurisdictional police agency were comparable, in reality, differences in systems and operational procedures have the potential to introduce some jurisdictional variance that may limit the reliability of between-jurisdiction comparisons.

Further, as illustrated earlier, Soundex was used as an important tool to aid in the identification of victimisation records for persons whose names had been misspelled or mis-entered. The use of Soundex was justified on the basis that foreign names are those with the highest likelihood of data entry errors and that the failure to identify these records may have inadvertently resulted in lower than actual victimisation rates.
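
For readers unfamiliar with Soundex, the sketch below implements the widely documented American Soundex rules: keep the first letter, code the remaining consonants as digits, collapse adjacent repeats, and pad to four characters. Which Soundex variant each police system applied is not stated here, so this is an assumption, and the function name and sample names are illustrative only.

```python
# Consonant groups of American Soundex; vowels, H, W and Y carry no code.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex(name):
    """Return the four-character American Soundex code for a Latin-alphabet name."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != previous:
            result.append(code)
        previous = code  # a vowel resets this, so a repeat across a vowel is coded again
    return ("".join(result) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Nguyen"), soundex("Nguyn"))   # N250 N250
```

A record entered as "Nguyn" would therefore still be matched to "Nguyen", which is the kind of data entry error the procedure was intended to recover. The same mechanism also gives distinct names such as "Robert" and "Rupert" one code, which is how false positives can arise.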

However, it is important to note that the use of Soundex may introduce other issues that, although unable to be quantified, should be considered when interpreting these results. In particular, it is not entirely clear whether Soundex is equally reliable for foreign spoken names whose phonetic structure differs from English spoken names. Further, it is not clear whether any such bias, if it exists, relates more or less to specific names and languages such that the reliability of Soundex may have significantly differed between the five source countries included in this study.

In the absence of any reliable test for such a bias, the benefits derived from the use of Soundex were considered to outweigh the potential disadvantages. Nevertheless, it is important to recognise that no administrative data matching procedure is 100 percent accurate and that this study, like all others of its kind, seeks to maximise reliability by finding the point at which false negatives (records that are missed) and false positives (records that are incorrectly matched) are both likely to be minimised.

Unobserved heterogeneity

In an effort to maximise the comparability of victimisation rates for each of the separate source countries and jurisdictional population averages, results have been presented separately for males and females, and weighted proportional to the average of the student age distributions in each year. Although these efforts are likely to have mitigated the impact of some demographic bias, there may still be other unobserved differences between the samples that influence the results. These differences might include marital status, access to financial resources and employment history, to name a few.

These differences, commonly referred to as ‘unobserved heterogeneity’, are many and varied and exist in all research studies. They serve as an important reminder that even after all possible controls are applied, differences may still exist between samples because of something that remains unobserved in the data.

In this study, since gender and age are controlled, differences in the rate of victimisation may be used to infer differential risk associated with a person’s race. However, it is critical that such a conclusion be considered carefully in the context of unobserved factors, other than race itself, which may exert greater influence on victimisation. Routine activities theory, for example, has earlier been discussed as one of the more relevant criminological explanations of international student victimisation. According to the theory, it is possible to hypothesise that where a person works and the type of employment they undertake are likely to influence their risk of victimisation. If there are unobserved differences in the average employment status of students from different countries, and if these differences are not controlled, then it is possible that the ‘race’ effect is partly due to differences in employment, rather than physical appearance or cultural practices.

Operational and other data recording bias

It is likely that some of the data collected during this project are subject to operational and other data recording biases, which may or may not affect the consistency of the data and the subsequent interpretation of the results.

Incident locations, for example, may be subject to differential recording practices depending on local or state protocols, or may be otherwise influenced by differential recording between different police officers within a jurisdiction. Where one officer might record the robbery of a taxi driver as an ‘on the street’ offence, another might record it as ‘in a taxi’. Moreover, where a victim reports to the police over the phone or in person some time after the incident occurred, what subsequently gets recorded as the location and time of the incident might be influenced by the victim’s ability to recall their experience in the same detail that would be available for offences investigated by the police shortly after the incident occurred.

Further, recording bias may also occur when recording the victim details for offences occurring at service stations or in taxis, where the attendant/driver happens to be a foreign student but the recorded victim may be the owner. Finally, there are any number of possible issues that affect the consistency and accuracy of police victimisation data in general. These issues affect all research of this kind and are not specific to this study.


End notes

i An alternative method of random redistribution was considered; however, the bias introduced by such a process was unlikely to improve the reliability of the final results. Here, students with an unknown CRICOS number would be re-allocated to each of the eight states and territories according to the jurisdictional distribution of students with a known CRICOS number. In practice, 47 percent of South Korean students with an unknown CRICOS number would be randomly allocated into the NSW student population count because 47 percent of those with a known CRICOS number were from New South Wales. The basic assumption of this random redistribution is the absence of any jurisdictional bias between those with and without a CRICOS number. However, since the CRICOS number is institutionally unique, it is probable that missing data pertain to specific institutions in specific locations and therefore, the assumption of equality in distributions is likely to be spurious. Moreover, throughout this report some effort is made to apply population weightings for comparative analysis. These weightings are based on age and gender distributions which, as illustrated later, are not equal between each of the five source countries. The random redistribution of missing cases would not adequately account for these variations and would likely bias the results.

ii The number of sexual assault cases over the period 2004–10 involving students was approximately 200. The low number of reported cases and the use of ABS offence classification prevented comparative analysis.

iii Self-report surveys are a useful alternative in that they typically ask respondents to report if and how often they have experienced a particular crime type, irrespective of whether they reported the incident to the police. However, such surveys are not only expensive to run, they also take longer to generate results and are difficult to target to minority populations (including international students). Each of the various methods used to interview respondents (telephone interviews, drop and collect, online surveys) has issues that ultimately limit the final analysis (see de Vaus 2002).