Faking data with Faker.js and Chance.js

I’m increasingly favouring a data-first approach to new applications. By this I mean taking a candidate data model and bulking out a database with a silly quantity of fake data. Test all your core or risky queries, kick the tires on your indexes, and know that it will work before you start building out your application on a firm foundation. There’s no point waiting until the end to test capacity and performance; by then it’s way too late.

The advantage of this approach is that, as you’re building out the app, you have realistic data to work with. Producing realistic-looking fake data by hand quickly loses its appeal, and for non-trivial models the results end up not looking so realistic. So I did some digging around and came across Faker.js and Chance.js. Between the two of them, they make producing quick and dirty instances of a model easy.

So I have a little Node.js HTTP service that responds with a JSON representation of a model instance, complete with fake data. On the .NET side, a bulk loader reads these instances into your database of choice.

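var Chance = require('chance');
var chance = new Chance();

// Build between zero and three fake agent records for the given project id.
// createAgent() is defined elsewhere in the service and is not shown here.
function createProjectAgents(id) {
    'use strict';
    var agents = chance.n(
        function () {
            var agentId = chance.guid();
            return {
                _type: 'agents',
                for: id,
                scope: 'project',
                role: chance.pick(['admin', 'advisor', 'consultant', 'advocate']),
                who: agentId,
                agent: createAgent(agentId)
            };
        },
        // How many agents to generate: a random count from 0 to 3.
        chance.integer({
            min: 0,
            max: 3
        })
    );
    // Assumed ending: the snippet breaks off here, so return the generated array.
    return agents;
}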
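
The Node.js service itself isn't shown here, but a minimal sketch of the idea might look like the following. It is not the actual service: the names makeFakeAgent, createHandler and startFakeDataService are hypothetical. To keep the sketch free of dependencies, the generator uses Node's built-in crypto.randomUUID() and Math.random() as stand-ins where a real generator would presumably call Chance.js or Faker.js.

```javascript
'use strict';
var http = require('http');
var crypto = require('crypto');

// Stand-in generator (hypothetical): a real service would build this
// object with Chance.js or Faker.js calls instead of the built-ins below.
function makeFakeAgent() {
    var roles = ['admin', 'advisor', 'consultant', 'advocate'];
    return {
        _type: 'agents',
        scope: 'project',
        role: roles[Math.floor(Math.random() * roles.length)],
        who: crypto.randomUUID()
    };
}

// Wrap any generator function in a request handler that replies with
// one freshly generated instance, serialised as JSON.
function createHandler(makeInstance) {
    return function (req, res) {
        var body = JSON.stringify(makeInstance());
        res.writeHead(200, {
            'Content-Type': 'application/json',
            'Content-Length': Buffer.byteLength(body)
        });
        res.end(body);
    };
}

// Start the service on the given port and return the server,
// so the caller can close it again when the bulk load is done.
function startFakeDataService(port, makeInstance) {
    return http.createServer(createHandler(makeInstance)).listen(port);
}
```

Calling startFakeDataService(3000, makeFakeAgent) would serve a new fake instance on every request to port 3000. Passing the generator in as a function means the same handler can serve any model, so swapping in a Chance.js-based generator needs no change to the HTTP code.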

Take a gander at the docs; a single snippet doesn’t do them justice. It’s the broad coverage of dozens of functions like faker.address.streetAddress() or faker.phone.phoneNumber() that makes the difference between some quick and dirty scripting and a chunk of work that just doesn’t seem worth it.