
I've lost count of the number of times I've needed to generate a representative dataset of users. Of course I have access to many production datasets, but for many reasons they can't be used. Finding datasets I've previously generated always seems to take longer than it should, so with my most recent iteration of having to generate a fictitious list of users with Australian addresses, I've documented how I went about it, along with the source data I used and the script to create it.

I deleted all other columns so that I was left with just over 13,000 surnames in a CSV file.

Matthew Proctor's list of Australian Postcodes is available as a CSV. This provides Postcode, Suburb and State. Brisbane City Council (Australia's largest Council) has a dataset with all bus locations, also as a CSV, that includes street names. As I did for surnames, I used the Excel function to remove duplicates, removed the blanks and the other columns, and was left with just over 1,600 street names. The GitHub repo contains the PowerShell script along with the source files. It imports each of the CSVs listed above and generates a random number based on the number of records in each file.
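As a rough illustration of the approach the script takes (import each CSV, then pick a record using a random number bounded by the number of records in that file), here is a minimal sketch. It is not the script from the repo. The column names (`Surname`, `Street`, `Postcode`, `Suburb`, `State`), the function name `New-FictitiousUser`, the tiny inline sample data and the random street number are all my own assumptions for the example. With the real files you would load each source with `Import-Csv -Path` instead of the inline samples.

```powershell
# Tiny inline samples stand in for the real CSV files.
# Column names are assumptions, not necessarily the headers in the source files.
$surnames = @'
Surname
Smith
Nguyen
Williams
'@ | ConvertFrom-Csv

$streets = @'
Street
Adelaide St
Ann St
Coronation Dr
'@ | ConvertFrom-Csv

$postcodes = @'
Postcode,Suburb,State
4000,Brisbane City,QLD
2000,Sydney,NSW
3000,Melbourne,VIC
'@ | ConvertFrom-Csv

# Hypothetical helper: builds $Count fictitious users from the three lists.
function New-FictitiousUser {
    param([int]$Count = 5)

    for ($i = 0; $i -lt $Count; $i++) {
        # Get-Random -Maximum is exclusive, so each result is a valid
        # zero-based index into the corresponding list.
        $surname  = $surnames[(Get-Random -Maximum $surnames.Count)]
        $street   = $streets[(Get-Random -Maximum $streets.Count)]
        $locality = $postcodes[(Get-Random -Maximum $postcodes.Count)]

        [pscustomobject]@{
            Surname  = $surname.Surname
            # Street number between 1 and 199, purely illustrative.
            Address  = '{0} {1}' -f (Get-Random -Minimum 1 -Maximum 200), $street.Street
            Suburb   = $locality.Suburb
            State    = $locality.State
            Postcode = $locality.Postcode
        }
    }
}

$users = New-FictitiousUser -Count 5
$users | Format-Table -AutoSize
```

Suburb, State and Postcode are taken from the same randomly chosen row so that each generated address stays internally consistent. The surname and street are drawn independently of that row.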
