Anzacs and Soldiers from the Indian Subcontinent

Team Name: 

Soldiers from the Indian subcontinent formed one of the largest contingents of soldiers in World War I. Over one million soldiers from the Indian subcontinent fought in Africa, Europe and the Middle East. Approximately 74,000 soldiers from the subcontinent were killed during the War. Some Australian soldiers mentioned the work of the Indian soldiers in World War I, but over the years the significant Indian contribution has been forgotten.

This project reveals the names of Australian and New Zealand soldiers born in the Indian subcontinent, the comments Australian soldiers made about Indian soldiers in their diaries, and photos relating to Indian soldiers that are available from Australian archives.


This project is building a website which will be a resource for people with ancestors from the Indian subcontinent to do family history research and to understand more about the significant participation of soldiers from the Indian subcontinent in the Great War. The website is designed to appeal to members of the South Asian diaspora who may, during the Centenary, be curious about the role of their forebears. It is also hoped that it will be used by people living in South Asia as well as by people who come from Anglo-Indian families.

What Our Project Does

We have produced a website where the user can search for soldiers by surname, by military unit and by Indian place. There is a heat map which is intended to show at a glance the number of people from each region of the subcontinent who were involved in the War. At present it is only a placeholder, because we did not have time to clean the data and geocode the historic Indian place names. There is also an interactive Word Cloud (a placeholder at the moment) which allows people to click on words and see the diary pages which mention those words, drawn from our database of World War I Australian soldier diary pages matching the search "Indian AND Indians NOT Ocean".
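The search by surname, unit and place could be sketched roughly as follows. This is a minimal illustration only: the column names (`surname`, `unit`, `birthplace`) and the sample rows are invented for the example and are not taken from the project's CSV files or from real service records.

```python
import csv
import io

# Invented sample rows for illustration only; not real service records.
# The column names are assumptions, not the project's actual CSV headers.
SAMPLE_CSV = """surname,unit,birthplace
Singh,1st Battalion,Lahore
Smith,4th Light Horse,Calcutta
Khan,1st Battalion,Bombay
"""


def load_records(text):
    """Parse CSV text into a list of dictionaries, one per soldier."""
    return list(csv.DictReader(io.StringIO(text)))


def search(records, surname=None, unit=None, birthplace=None):
    """Return the records that match every criterion supplied.

    Matching is a case-insensitive substring test; criteria left as
    None are ignored.
    """
    criteria = {"surname": surname, "unit": unit, "birthplace": birthplace}
    results = []
    for record in records:
        if all(
            wanted is None or wanted.lower() in record[field].lower()
            for field, wanted in criteria.items()
        ):
            results.append(record)
    return results


records = load_records(SAMPLE_CSV)
```

For example, `search(records, unit="1st battalion")` returns the two invented rows whose unit contains that text. Substring matching is used here because historic spellings of units and places vary; an exact match would be stricter but would miss more records.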

People can also explore the diary pages where Indians are mentioned via a page with a histogram showing the distribution of the dates when those comments were made. People can then click on a place on the histogram and browse through the pages. This feature requires manual coding of the dates of the diary pages, something which we did not have time to do.
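Once the diary pages have been date-coded, the histogram data could be derived along these lines. This is a sketch under assumptions: the `page_id` and `date` fields and the sample entries are hypothetical, and grouping by month is one possible bin size among several.

```python
from collections import Counter
from datetime import date

# Hypothetical, manually coded diary pages; a date of None means the
# page has not been coded yet.
diary_pages = [
    {"page_id": "p1", "date": date(1915, 4, 25)},
    {"page_id": "p2", "date": date(1915, 4, 30)},
    {"page_id": "p3", "date": date(1915, 5, 2)},
    {"page_id": "p4", "date": None},
]


def monthly_histogram(pages):
    """Count the dated pages in each (year, month) bin, in date order."""
    counts = Counter(
        (p["date"].year, p["date"].month)
        for p in pages
        if p["date"] is not None
    )
    return dict(sorted(counts.items()))


def pages_in_month(pages, year, month):
    """Return the ids of the pages in one bin, for browsing after a click."""
    return [
        p["page_id"]
        for p in pages
        if p["date"] is not None
        and (p["date"].year, p["date"].month) == (year, month)
    ]
```

The first function supplies the bar heights and the second supplies the pages to show when a bar is clicked. Undated pages are skipped rather than guessed, so the histogram only reflects pages that have actually been coded.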

Sensitive Issues

We need to be alert to issues surrounding the exposure of sensitive information when using any historical material. The material we are using may contain terms people now regard as offensive such as "nigger" and the use of words regarding dirtiness when used about people of different racial backgrounds. The Great War took place in an era when there were discriminatory government policies such as the White Australia Policy, segregation and other forms of onerous, systematic disadvantage directed towards non-European people. 

We should not censor the history by removing these references, but we intend to have a warning flash up every time a potentially offensive diary comment is on screen. It is also intended that there be a page that explains this history to the modern reader.
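One way the warning could be triggered is to check each diary page against a curated list of terms before it is displayed. The sketch below is an assumption about how this might work, not the project's implementation; the function names are hypothetical and the term list would need to be compiled and reviewed by hand.

```python
import re


def build_matcher(terms):
    """Compile a case-insensitive, whole-word pattern from a list of terms."""
    if not terms:
        raise ValueError("the term list must not be empty")
    # Longer terms first so multi-word phrases are preferred over their parts.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def needs_warning(text, matcher):
    """Return True if the page text contains any listed term."""
    return matcher.search(text) is not None
```

A page would be passed through `needs_warning` with a matcher built from the curated list, and the warning shown when it returns `True`. The diary text itself is left unchanged, in keeping with the decision not to censor the history.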

This website will not distinguish by race. Soldiers are named because they were born in India or served with the Indian forces. Many of the officers of the Indian units were British. Given the inter-marriage between British and Indians from the early days of the East India Company, it is not possible to gauge someone’s race by name or by visual features.

This website refers to the entire Indian subcontinent, not just the nation that is now known as India. Thus soldiers from the countries now known as India, Pakistan, Bangladesh, Burma and Sri Lanka are included. Where the word India is used on the website, it refers to the Indian Subcontinent or ‘United India’. 

Tools Used

The data was downloaded via APIs, CSV downloads and the Kimonolabs tool. Data cleaning was done through OpenRefine. Most of the data used by the website is in .csv format except for the database of the State Library of NSW WWI soldier diaries. The website was built using

This project is a proof of concept which can be extended in a large number of ways.


Video Upload: 
Datasets Used: 
- National Archives of Australia WWI service records API:"India"&rows=2000&page=1&app_id=<app_id>&app_key=<app_key>. We used OpenRefine to clean this data.
- Auckland War Memorial Museum Online Cenotaph: We did a manual search with Place of Birth="India", War="World War I". We received 12 results, so we entered these by hand into our War Service Records database. We also did a manual search for the surnames "Khan" and "Singh" and found 2 records, which we manually added to our database.
- Australian War Memorial photographs: We could not get the API to work to download photos, so we constructed a nested API on this search URL: The API we created through Kimonolabs opened each of the 493 results of this search and downloaded the captions, the source image URL etc.
- SLQ Photographs 1914-1918: We downloaded the CSV file from and manually searched for each instance of the word "India". We had to manually filter out references to American Indians. We found 3 images and downloaded 500px versions of them.
- The State Library of NSW transcribed diaries: We downloaded the XML transcripts made available.

We wanted to access references to Indians in the Great War from the Trove database, and we would have liked to include the details of nearly 80,000 deceased British Army soldiers from the Commonwealth War Graves Commission, but we ran out of time.
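The manual filtering of the SLQ captions described above could be partly automated with a keyword filter. This is a rough sketch, not what the team did: the patterns and the function name are assumptions, and the exclusion list would need to grow as new false positives turn up.

```python
import re

# Matches "India", "Indian" or "Indians" as whole words, so that
# "Indiana" is not counted.
INDIA_PATTERN = re.compile(r"\bIndia(?:n|ns)?\b", re.IGNORECASE)

# Phrases that contain "Indian" but do not refer to the subcontinent's
# people. An assumed starting list only.
EXCLUDE_PATTERN = re.compile(
    r"\b(?:American Indians?|Indian Ocean)\b", re.IGNORECASE
)


def mentions_subcontinent(caption):
    """Return True if a caption mentions India outside the excluded phrases."""
    # Blank out the excluded phrases first, then look for what remains.
    remaining = EXCLUDE_PATTERN.sub(" ", caption)
    return INDIA_PATTERN.search(remaining) is not None
```

Note one design difference: the diary search "Indian AND Indians NOT Ocean" drops any page containing the word "Ocean", whereas this sketch only removes the excluded phrase, so a caption about Indian troops crossing the Indian Ocean is still kept. Results would still need a manual check.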

Local Event Location: