Processing your email

Rename server/node1.js to parser.js and open it in your editor. It should look like this:

module.exports = function (got) {
  // got.in contains the key/value pairs that match the given query
  const inData = got.in;   
  
  console.log('counter: node1.js: running...', inData.data);

  const json = inData.data.map(d => JSON.parse(d.value));

  json.forEach(function(value, i){
    console.log('datum#', i, 'value:', value)
  })

  return [{
    key: 'dataLength',
    value: json.length
  }];
};

The default implementation just logs the messages and returns a simple data structure.
The messages are provided as an array of events because our platform may batch messages when multiple events are available for processing. In the next node we write, we will also see why this proves helpful for doing real work.
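
To make the batching concrete, here is a minimal sketch of the kind of object the node might receive. It assumes, based on the default implementation above, that each entry in got.in.data is a key/value pair whose value is a JSON string. The keys and email fields are invented for illustration.

```javascript
// Hypothetical batch of two events; keys and field values are invented.
const got = {
  in: {
    data: [
      { key: 'msg-1', value: JSON.stringify({ id: 'msg-1', subject: 'Hello' }) },
      { key: 'msg-2', value: JSON.stringify({ id: 'msg-2', subject: 'Lunch?' }) },
    ],
  },
};

// Same parsing step as the default implementation: one parsed object per event.
const json = got.in.data.map(d => JSON.parse(d.value));
console.log(json.length); // 2
```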

Let’s change this implementation to produce a list of emails, each with a total word count. First, we are only interested in emails we receive; we don’t want to compute metrics for the messages we sent, so let’s filter them out. Go ahead and add the following line after the declaration of the json variable.

module.exports = function (got) {
// ...
// const json = inData.data.map(d => JSON.parse(d.value));
// json.forEach(function(value, i){
//    console.log('datum#', i, 'value:', value)
//  })
// ...

const filteredEmails = json.filter(j => j.user !== j.from.email);

// ...
};

We simply check that the user (i.e. the owner of the account) is not the person who sent the email.
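
As a quick illustration, the same filter can be tried on a couple of hand-written objects. This is only a sketch: the addresses are invented, and real messages carry many more fields than the two used here.

```javascript
// Sample parsed emails; the addresses are invented for illustration.
const json = [
  { id: 'a', user: 'me@example.com', from: { email: 'friend@example.com' } },
  { id: 'b', user: 'me@example.com', from: { email: 'me@example.com' } },
];

// Keep only messages whose sender is not the account owner.
const filteredEmails = json.filter(j => j.user !== j.from.email);
console.log(filteredEmails.map(j => j.id)); // [ 'a' ]
```
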
Next, we need to get the text content of the message by doing a cascade selection (falling back to the HTML body, or to an empty string if neither can be found), and then clean it up before tokenizing it and counting the words.

module.exports = function (got) {
  // const filteredEmails = json.filter(j => j.user !== j.from.email);

  // ...
  const emails = filteredEmails.map((value) => {
    const text = value.textBody || value.strippedHtmlBody || '';
    return text;
  });
  // ...
};
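
The cascade relies on `||` returning its first truthy operand. A small sketch with made-up messages shows the three cases (plain text present, HTML only, neither):

```javascript
// Hypothetical messages: one with a plain-text body, one HTML-only, one with neither.
const messages = [
  { textBody: 'plain text', strippedHtmlBody: 'from html' },
  { strippedHtmlBody: 'from html' },
  {},
];

// `||` returns the first truthy operand, so missing or empty fields fall through.
const texts = messages.map(m => m.textBody || m.strippedHtmlBody || '');
console.log(texts); // [ 'plain text', 'from html', '' ]
```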

We can preprocess the text in any number of ways, but given that these are pretty common tasks when processing text in emails, we made a tiny library for them. You can add it to your Sift with npm. On the command line:

cd server && npm install --save @redsift/text-utilities
cd .. # to return to sift folder

Then ‘require’ it and use it in your code as you would in any Node.js application.

// import text util
const textUtils = require('@redsift/text-utilities');

// ...

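// inside the emails map, in place of `return text;`:
// const emails = filteredEmails.map((value) => {
//   const text = ...
    const subject = value.subject || '';
    // trim quoted threads, split into words, and count them
    const count = textUtils.splitWords(textUtils.trimEmailThreads(text)).length;

    return {
      subject,
      text,
      count,
      id: value.id,
    };
// });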

We then need to send the data to the node's output (the emails-st store):

// ...
module.exports = function (got) {
  // ...
  // const emails = filteredEmails.map(...
  // ...
  // });

  return emails.map(value => ({
    name: 'emails-st',
    key: value.id, // lets us look up/retrieve emails by id according to the schema
    value,
  }));
};

Run the Sift by selecting the run icon (second down on the left) and hitting the run key. You'll see output on the console but, of course, nothing on the web interface, as we haven't connected that up yet.
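
To check the logic without running the whole Sift, the node function can also be called directly with a hand-built got object. The sketch below is not the parser.js from this section verbatim: so that it runs without installing @redsift/text-utilities, it swaps in a naive whitespace word counter (countWords, a hypothetical helper), which means its counts will differ from the library's on real emails with quoted threads. The sample messages are invented.

```javascript
// Stand-in for textUtils.splitWords(textUtils.trimEmailThreads(text)).length.
// Hypothetical helper: splits on whitespace only and does not trim quoted threads.
const countWords = text => text.split(/\s+/).filter(Boolean).length;

// Same steps as the node built in this section: parse, filter, map, emit.
const parser = function (got) {
  const json = got.in.data.map(d => JSON.parse(d.value));
  const filteredEmails = json.filter(j => j.user !== j.from.email);
  const emails = filteredEmails.map((value) => {
    const text = value.textBody || value.strippedHtmlBody || '';
    return { subject: value.subject || '', text, count: countWords(text), id: value.id };
  });
  return emails.map(value => ({ name: 'emails-st', key: value.id, value }));
};

// Invented sample batch: one received email, one sent by the account owner.
const sample = [
  { id: 'a', user: 'me@example.com', from: { email: 'friend@example.com' }, subject: 'Hi', textBody: 'see you at noon' },
  { id: 'b', user: 'me@example.com', from: { email: 'me@example.com' }, subject: 'Re: Hi', textBody: 'ok' },
];
const got = { in: { data: sample.map(m => ({ key: m.id, value: JSON.stringify(m) })) } };

const out = parser(got);
console.log(out.length, out[0].key, out[0].value.count); // 1 a 4
```

Only the received message survives the filter, and it is emitted to emails-st keyed by its id.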