Cassandra Routing Keys in the DataStax C# Driver

I recently read an article by a DataStax solution architect stating that to get the best insert performance from unlogged BatchStatements, all statements in the batch should belong to the same partition so that the coordinator node doesn’t have to re-route statements for partitions it doesn’t own. There is a problem with this advice, however (as noted in my comments at the bottom of that article): as of this writing, the DataStax C# driver uses RoundRobinPolicy as its default load balancing policy, in contrast to the Java driver, which uses a TokenAwarePolicy wrapping a DCAwareRoundRobinPolicy. This has since been acknowledged in this bug.

From https://github.com/datastax/csharp-driver/blob/c891b57c9f3cf52a75ccb888bf76fe0dad452afd/src/Cassandra/Policies/Policies.cs:

/// <summary>
///  The default load balancing policy. <p> The default load balancing policy is
///  <link>RoundRobinPolicy</link>.</p>
/// </summary>
public static ILoadBalancingPolicy DefaultLoadBalancingPolicy
{
	get
	{
		return new RoundRobinPolicy();
	}
}

From https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/policies/Policies.java:

/**
 * The default load balancing policy.
 * <p>
 * The default load balancing policy is {@link DCAwareRoundRobinPolicy} with token
 * awareness (so {@code new TokenAwarePolicy(new DCAwareRoundRobinPolicy())}).
 *
 * @return the default load balancing policy.
 */
public static LoadBalancingPolicy defaultLoadBalancingPolicy() {
	// Note: balancing policies are stateful, so we can't store that in a static or that would screw thing
	// up if multiple Cluster instance are started in the same JVM.
	return new TokenAwarePolicy(new DCAwareRoundRobinPolicy());
}

The problem here is that even if you do group batches by partition, the RoundRobinPolicy does nothing to ensure the coordinator actually owns that partition, so the effort is wasted. Thankfully the C# driver includes a TokenAwarePolicy, but how to use it, and especially how to set the routing key of a BatchStatement, seems to be completely undocumented. I had to trace through the code to find out how it works, so hopefully this article will save somebody else the trouble until DataStax brings the C# driver in line with the Java driver.

First step, set the policy and connect:

ILoadBalancingPolicy childPolicy = Policies.DefaultPolicies.LoadBalancingPolicy;
ILoadBalancingPolicy tokenAwarePolicy = new TokenAwarePolicy(childPolicy);
Cluster cluster = Cluster.Builder().AddContactPoint("host").WithLoadBalancingPolicy(tokenAwarePolicy).Build();
ISession session = cluster.Connect("keyspace");

Create an unlogged BatchStatement and fill it with data for a single partition:

PreparedStatement stmt = session.Prepare("...");
BatchStatement batch = new BatchStatement();
batch.SetBatchType(BatchType.Unlogged);
batch.Add(stmt.Bind(...));
...

Create and set the routing keys. This example assumes a compound partition key consisting of a string and an int, and uses my own ‘Util’ class:

RoutingKey[] routingKey = new RoutingKey[] {
  new RoutingKey() { RawRoutingKey = Util.StringToBytes(keyPart1) },
  new RoutingKey() { RawRoutingKey = Util.Int32ToBytes(keyPart2) }
};
batch.SetRoutingKey(routingKey);

For information on how to properly encode different types for keys, have a look in TypeCodec.cs – here are some examples to get you started:

  • Int32 is encoded as a big-endian byte array:
    public static byte[] Int32ToBytes(int value)
    {
    	return new[]
    	{
    		(byte) ((value & 0xFF000000) >> 24),
    		(byte) ((value & 0xFF0000) >> 16),
    		(byte) ((value & 0xFF00) >> 8),
    		(byte) (value & 0xFF)
    	};
    }
    
  • Strings are encoded as a UTF-8 byte array:
    public static byte[] StringToBytes(string value)
    {
    	return Encoding.UTF8.GetBytes(value);
    }
    

It’s a bit hard to tell if your logic is working correctly, especially when you have compound keys of different types. I ended up putting a breakpoint inside TokenAwarePolicy.cs/NewQueryPlan() and validating the chosen node against the output of the nodetool getendpoints command, and I’d suggest you do the same:

nodetool getendpoints keyspace table keyPart1:keyPart2

Whether or not any of this is worth the effort is a whole other question, and my personal experience tells me that it probably isn’t. In my time working with a large Cassandra cluster, I have consistently gotten better performance using larger batches of heterogeneous data than smaller batches of partitioned data. In the future I hope to write a whole article on the topic of Cassandra insert performance and back it up with real numbers.

That’s all for now, I hope somebody found it useful. If you have any comments, please leave them below.

Multipart Parsing in Node.js Express 4

Just how, exactly, do you use the multiparty module?

If you’ve migrated to Express 4, you may have noticed that with the removal of the bundled Connect middleware, there is no longer a built-in solution for multipart form parsing.  We’ve been playing with the commonly recommended multiparty module and it seems to be a reasonable alternative, with one major problem: the documentation isn’t very helpful.  I consider it a fundamental part of the Node.js experience to be able to look at a package’s npm page and, within minutes, know exactly how to use it based on the docs and helpful examples, and here I was let down.  Hopefully the examples below help to fill in the gaps.

Here’s an example showing how to work with both fields and files from a multipart/form-data post.

var multiparty = require("multiparty");

// inside an Express route handler, e.g. app.post("/upload", ...):
function handler(req, res) {

	// ...

	var form = new multiparty.Form({maxFieldSize: 8192, maxFields: 10, autoFiles: false});
	form.on("part", function(part) {
		if (!part.filename)
		{
			return;
		}

		// if we got this far, the part represents a file
		var stream; // = (whatever)

		// ...

		part.pipe(stream);
	});
	form.on("field", function(name, value) {
		// do something with the field
	});
	form.on("close", function() {
		// continue with the rest of your handler
	});
	form.parse(req);

	// ...
}

The key here is that you deal with form field values in the “field” handler, and files in the “part” handler. However, you cannot touch a field part in the “part” handler (with a part.resume(), etc) otherwise the part doesn’t come through properly downstream. In the example above, if we see that the part has no filename, we return immediately and never look back.

Here’s a kitchen-sink example, showing how to (blindly) save all form data to MongoDB while streaming file attachments into GridFS using gridfs-stream. The asynchronous bits between multiparty and gridfs-stream can be a little tricky, so hopefully the example is helpful to someone:

var express = require('express');
var multiparty = require("multiparty");
var async = require("async");
var mongo = require("mongodb");
var MongoClient = mongo.MongoClient;
var Grid = require("gridfs-stream");

function saveRequest(req, options, callback) {
    var context = {
        db: null,
        gfs: null,
        gfsOps: 0,
        request: {
            time: new Date()
            // ...
        }
    };
    async.waterfall([
        function(callback) {
            MongoClient.connect(options.connString, callback);
        },
        function(db, callback) {
            context.db = db;
            context.gfs = Grid(db, mongo);
            callback(null);
        },
        function(callback) {
            var form = new multiparty.Form({maxFieldSize:8192, maxFields:10, autoFiles:false});
            var formClosed = false; // true once multiparty has parsed the entire request
            form.on("part", function(part) {
                if (!part.filename)
                {
                    return;
                }
                
                context.gfsOps++;
                var writeStream = context.gfs.createWriteStream({
                    mode: "w",
                    filename: part.filename,
                    content_type: part.headers["content-type"]
                });
                writeStream.on("close", function() {
                    context.gfsOps--;
                    // only advance once the form is fully parsed and no GridFS writes remain,
                    // otherwise an early stream close could invoke the callback twice
                    if (formClosed && context.gfsOps == 0)
                    {
                        callback(null);
                    }
                });
                
                part.pipe(writeStream);
            });
            form.on("field", function(name, value) {
                context.request[name] = value;
            });
            form.on("close", function() {
                if (context.gfsOps == 0)
                {
                    callback(null);
                }
            });
            form.parse(req);
        },
        function(callback)
        {
            var collection = context.db.collection("blargh");
            collection.insert(context.request, callback);
        }
    ],
    function(err) {
        if (context.db)
        {
            context.db.close();
        }
        callback(err);
    });
}

module.exports.router = function(options) {
    var router = express.Router();    
    router.post('/blargh', function(req, res) {
        saveRequest(req, options, function(err) {
            if (err)
            {
                res.writeHead(500);
                res.end();
            }
            else
            {
                res.redirect("/");
                res.end();
            }    
        });
    });

    return router;
}

The issue here is that you need to know when everything is done before you close the db connection. So, we increment a ‘gfs operations’ counter every time we start streaming a new part into GridFS, and decrement it when the stream closes. Whichever handler is last to see that the form has finished parsing and that there are no pending operations (either a stream-close or the form-close) invokes the callback to advance; the formClosed flag is there so an early stream close can’t fire the callback twice. This may not be the best way to handle the problem, but it’s the first thing that occurred to me.

That’s all for now.  If you have any comments, please leave them below.

Expression-Based Security

A pattern for expressive security in a generic CMS

Hello everyone.  I’d like to spend some time discussing a fairly simple but (from what I can tell) unique security feature in Cyanic Business Automation Studio (the platform that powers Cyanic HSE), in the hope that it may help or inspire other developers working with, or building solutions on top of, generic CMS systems.

Imagine the following scenario: You are building a work-order management system for a company where office staff have administrative access to all work orders in the system, and they assign jobs to external contractors when required.  Contractors should only be able to see and update jobs that are assigned to them, and it’s not good enough to simply filter results on the page – we need real security that’s enforced down at the data layer.  If you’re using SharePoint, you could easily set up your Content Types and lists to model the data, but you’ll run into a problem with the security model.  SharePoint allows ACLs controlling blanket access to the list, and granular item-level ACLs, but nothing based on expressive rules.  In our example, you’d need some external process, driven by Event Receivers or SharePoint Workflow, to continually update item-level access whenever the data changes.  If you’ve ever heard my thoughts on Event Receivers or SharePoint Workflow, you can imagine my opinion on the long-term reliability prospects of such a solution.  From what I can tell, other CMSes are very much the same.

Of course, you would have none of these problems in a purely custom solution, as your data layer could enforce whatever business rules you need, but if you’re anything like us and trying to make good solutions at a price point that’s affordable to small and medium-sized customers, you really need to be using a generic framework and customizing it to fit.  Anyway, we can solve this, and since we’re not using SharePoint, we still have enough will-to-live and intestinal fortitude left to implement an elegant solution.

 

Some background on Cyanic Business Automation Studio’s data architecture:

All user data in CBAS is stored as JSON records in PostgreSQL (which offers excellent support for working with this kind of data and querying documents in SQL).  All logically related records are grouped into what we call collections, which enforce schema constraints (specifically, JSON Schema) on their contents.  Collections are then grouped into realms, which are isolated security environments, each with its own sets of users and roles, typically wrapping up all of the functions and data belonging to a particular organization.  At a high level, the data model looks like this:

[Figure: CBAS data model]

A record is represented by a row in the database; along with metadata such as when the record was created and by whom, it contains a data column of the JSON type.  As such, we can hold complicated and nested structures with a very simple representation, and we can query those structures with familiar SQL and a minimum of joins.  On a side note: our opinion is that PostgreSQL is a better document store than MongoDB, since we’re still able to benefit from referential integrity, transactions, a powerful query language, and all the other niceties you’ve come to expect from a modern RDBMS.  Here’s what a data record looks like, coming out of our web API:

 

{
  "id": "8bf7c9f3-40bd-45e6-afc6-91b875112c21",
  "createdTime": "2014-04-08T21:08:58.685Z",
  "createdBy": "cf6600f4-57a4-4493-893f-658fb42e9c67",
  "createdByName": "System",
  "updatedTime": "2014-05-12T21:21:05.737Z",
  "updatedBy": "cf6600f4-57a4-4493-893f-658fb42e9c67",
  "updatedByName": "System",
  "data": {
    "WorkToBeDone": "Sprinkler system broken. Does not turn off",
    "TaskLocation": "11639 76 Ave, Edmonton",
    "PhoneNumber": "780-328-2400",
    "EstimatedArrival": "2014-04-09T19:00:00.000Z",
    "EmergencyMeetingLocation": "work van",
    "AssignedTo": {
      "id": "1aead7ed-9661-43e7-b01c-04afd5b8e87b"
    },
    "Start": "2014-04-09T19:14:00.000Z",
    "Tasks": [
      {
        "Task": "turn off valve",
        "Hazards": [
          {
            "Hazard": "trip and fall",
            "Controls": "watch step"
          }
        ]
      }
    ],
    "UseOfGlovesRequired": false,
    "WarningRibbonRequired": false,
    "WorkingAlone": false,
    "WorkingAloneExplanation": "",
    "AllDoorsLocked": true,
    "AreaCleanedUp": true,
    "Incident": false,
    "IncidentExplanation": "",
    "HazardsRemaining": true,
    "HazardsRemainingExplanation": "uncovered hole",
    "HazardsRemainingPhoto": {
      "id": "77084320-4ff2-4c24-8987-477575bdd988",
      "contentType": "image/jpeg"
    },
    "CustomerSignature": {
      "id": "1524b4d9-c0b5-4a8d-b134-4326cf3c51ed",
      "contentType": "image/png"
    },
    "End": "2014-04-09T19:33:00.000Z",
    "TaskLocationCoords": {
      "latitude": 53.514153,
      "longitude": -113.5285366
    }
  }
}

To query user data, Javascript expressions are passed to our data provider (internally, or through our REST web API).  The provider parses these expressions into expression trees (using the excellent Acorn module) and recursively generates parameterized SQL that can be run against the database.  Of course, only certain kinds of query semantics are supported; any reference to a property not belonging to the schema, or any unsupported expression or parameter type, will generate a security fault.

Example: query all in-progress records (those with a start time set, but no end time) that belong to a specific user (in this case, the user’s id is ‘1aead7ed-9661-43e7-b01c-04afd5b8e87b’):

Query: $filter=data.AssignedTo.id == '1aead7ed-9661-43e7-b01c-04afd5b8e87b' && data.Start != null && data.End == null

Generated Expression tree:

{
  "type": "ExpressionStatement",
  "expression": {
    "type": "LogicalExpression",
    "left": {
      "type": "LogicalExpression",
      "left": {
        "type": "BinaryExpression",
        "left": {
          "type": "MemberExpression",
          "object": {
            "type": "MemberExpression",
            "object": {
              "type": "Identifier",
              "name": "data"
            },
            "property": {
              "type": "Identifier",
              "name": "AssignedTo"
            },
            "computed": false
          },
          "property": {
            "type": "Identifier",
            "name": "id"
          },
          "computed": false
        },
        "operator": "==",
        "right": {
          "type": "Literal",
          "value": "1aead7ed-9661-43e7-b01c-04afd5b8e87b",
          "raw": "'1aead7ed-9661-43e7-b01c-04afd5b8e87b'"
        }
      },
      "operator": "&&",
      "right": {
        "type": "BinaryExpression",
        "left": {
          "type": "MemberExpression",
          "object": {
            "type": "Identifier",
            "name": "data"
          },
          "property": {
            "type": "Identifier",
            "name": "Start"
          },
          "computed": false
        },
        "operator": "!=",
        "right": {
          "type": "Literal",
          "value": null,
          "raw": "null"
        }
      }
    },
    "operator": "&&",
    "right": {
      "type": "BinaryExpression",
      "left": {
        "type": "MemberExpression",
        "object": {
          "type": "Identifier",
          "name": "data"
        },
        "property": {
          "type": "Identifier",
          "name": "End"
        },
        "computed": false
      },
      "operator": "==",
      "right": {
        "type": "Literal",
        "value": null,
        "raw": "null"
      }
    }
  }
}

SQL:
SELECT ... WHERE r.id=$1 AND c.id=$2 AND (((d.data->$3->>$4) = $5 AND (d.data->>$6) IS NOT NULL) AND (d.data->>$7) IS NULL)

Where the query params are:

  • $1: <realmId>
  • $2: <collectionId>
  • $3: “AssignedTo”
  • $4: “id”
  • $5: “1aead7ed-9661-43e7-b01c-04afd5b8e87b”
  • $6: “Start”
  • $7: “End”

If you’ve heard me railing against ORMs due to their complexity and loss of control, I offer this pattern as a reasonable alternative. The code required to make this work isn’t large (our query generator is currently about 200 lines of code), it’s totally dynamic, and it gives you complete control over which semantics are supported and how queries are generated.
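
To make the recursion a bit more concrete, here is a minimal sketch of the approach (not our production generator; the names are hypothetical, only equality, null checks and logical operators are handled, and the real code also validates every property name against the collection’s schema and parameterizes the JSON keys themselves rather than concatenating them):

var acorn = require("acorn");

// compile("data.Start != null && data.End == null") =>
//   { where: "((d.data->>'Start') IS NOT NULL AND (d.data->>'End') IS NULL)", params: [] }
function compile(filter) {
    var ast = acorn.parse(filter, { ecmaVersion: 5 });
    var expr = ast.body[0].expression;   // expect a single ExpressionStatement
    var params = [];
    return { where: walk(expr, params), params: params };
}

function walk(node, params) {
    switch (node.type) {
        case "LogicalExpression":        // && and ||
            return "(" + walk(node.left, params) +
                (node.operator === "&&" ? " AND " : " OR ") +
                walk(node.right, params) + ")";
        case "BinaryExpression":         // == and !=
            if (node.right.type === "Literal" && node.right.value === null) {
                return "(" + walk(node.left, params) + ")" +
                    (node.operator === "==" ? " IS NULL" : " IS NOT NULL");
            }
            params.push(node.right.value);   // literals become query parameters
            return "(" + walk(node.left, params) + ") " +
                (node.operator === "==" ? "=" : "<>") + " $" + params.length;
        case "MemberExpression":         // data.AssignedTo.id => d.data->'AssignedTo'->>'id'
            return keysOf(node).reduce(function(sql, key, i, keys) {
                return sql + (i === keys.length - 1 ? "->>" : "->") + "'" + key + "'";
            }, "d.data");
        default:                         // anything else is a security fault
            throw new Error("unsupported expression: " + node.type);
    }
}

function keysOf(member) {                // e.g. ["AssignedTo", "id"]
    var keys = [];
    while (member.type === "MemberExpression") {
        keys.unshift(member.property.name);
        member = member.object;          // ends at the root 'data' identifier
    }
    return keys;
}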

 

Security enforcement

Now, how is security enforced on user queries?  The user’s security context is established and is available to every provider in the data layer.  Further, every type of entity in the system has ACLs mapping functional access to principals (either users or groups).  In the case of a collection, an ACL contains the following fields:

  • read: whether or not the principal can read the collection definition.
  • write: whether or not the principal can modify the collection definition.
  • item_read: whether or not the principal can read the records in the collection.
  • item_create: whether or not the principal can create new records in the collection.
  • item_update: whether or not the principal can update existing records in the collection.
  • item_delete: whether or not the principal can delete existing records in the collection.
  • item_read_expr: if item_read is true, an expression to mask the results of a record query.
  • item_update_expr: if item_update is true, an expression which must evaluate true on a record to allow updating of that record.
  • item_delete_expr: if item_delete is true, an expression which must evaluate true on a record to allow deletion of that record.

The ACL expressions are Javascript and, like all other expressions, are schema- and form-validated.  These expressions, if defined, are spliced into the main query tree and act as a mask over what would normally be returned by the query.

The overall logic flow then looks like this:

  • Client submits query request to data provider.
  • Provider first checks if the user has access to any ACLs on the collection granting basic item_read access (required at a minimum).
  • Generate a query tree from the main query expression.
  • If there is an accessible ACL expression, parse it into another expression tree and combine it with the main query tree, each subtree connected with an ‘AND’ operator.  If there are multiple ACL expressions, the expressions in the ACL subtree are connected with an ‘OR’ operator (see the sketch after this list).
  • Generate the parameterized SQL for the entire tree, and execute the query against the database.
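
As a rough sketch of the splicing step (hypothetical helper names, working on the same kind of expression trees as the generator sketch above; contextual values such as context.userId are resolved later, during SQL generation):

var acorn = require("acorn");

// Combine the user's query tree with the accessible ACL expression trees:
// ACL expressions are OR'd together, and the resulting mask is AND'd with the main query.
function applyAcls(queryExpr, aclExprSources) {
    if (aclExprSources.length === 0) {
        return queryExpr;                 // no mask to apply
    }
    var mask = aclExprSources
        .map(function(src) {
            return acorn.parse(src, { ecmaVersion: 5 }).body[0].expression;
        })
        .reduce(function(left, right) {
            return { type: "LogicalExpression", operator: "||", left: left, right: right };
        });
    return { type: "LogicalExpression", operator: "&&", left: queryExpr, right: mask };
}

// e.g. applyAcls(userQueryExpr, ["data.AssignedTo.id == context.userId"])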

 

Solution

Getting back to the original problem where contractors can only modify work orders that they have been assigned, we can see that the solution is a single ACL for the ‘work orders’ collection assigned to the ‘Contractors’ role, with the following:

  • read: true
  • write: false
  • item_read: true
  • item_create: false
  • item_update: true
  • item_delete: false
  • item_read_expr: “data.AssignedTo.id == context.userId”
  • item_update_expr: “data.AssignedTo.id == context.userId”
  • item_delete_expr: null

So, regardless of any queries submitted to the data layer by a Contractor, the “data.AssignedTo.id == context.userId” expression must always be satisfied before the record is returned.  Here, the contextual userId (‘1aead7ed-9661-43e7-b01c-04afd5b8e87b’) is substituted into the query at time of generation, but you can implement any kind of contextual support functions.  The limit is solely your imagination, and how big you want your query generators to get.

 

So, that’s basically it.  I hope this post gives someone out there some ideas, and if you have any questions, please post them below.

 

Happy hacking!

Cyanic Automation and Cloud Security

We get asked a lot of security questions relating to our cloud-based services (namely, Cyanic Business Automation Studio, the platform that powers our Cyanic HSE management software), and with good reason.  The protection of personal and corporate data is critical, and when considering any software-as-a-service option, it is important to know the policies and practices of those who are ultimately holding your information.

No system can be guaranteed to be 100% secure. This includes not only the cloud-based services we use every day, such as banking, shopping, and government services, but also the physical security of systems and hard-copy data on your premises, which are subject to theft, fire or natural disaster. Instead of the typical security hand-waving you’re probably used to, I’d like to do an in-depth review of Cyanic Automation’s strategies concerning electronic security, and allow organizations to make informed decisions about Cyanic Automation’s internet-based services.

Warning: technical jargon ahead.

Security Threats and Mitigation

Data Breaches

Data breaches are what most people are concerned about when using cloud-hosted services; specifically, that a remote attacker can exploit a weakness in an application or its infrastructure to gain access to sensitive information. Below is a list of strategies we use to minimize this risk:

  • All Cyanic Business Automation Studio servers are hosted in the Microsoft Azure cloud, providing a large, trusted name to manage both the electronic and physical security aspects of the hosting infrastructure. Binary data is hosted in Azure Storage which is protected by Microsoft-standard protocols.
  • Our database and web servers run on Linux with all but the most critical (HTTP/HTTPS) and secure (SSH) endpoints locked down to external access. Security updates are reviewed and applied on at least a weekly basis, and more frequently in the case of critical issues (such as the Heartbleed SSL issue).
  • Cyanic Business Automation Studio is built on a mature and commodity-level technology stack that powers a great deal of the internet and is supported by a large and responsive community.
  • Cyanic Automation uses industry best practices for the storage of user passwords (PBKDF2 with random salting and 256-bit keys), meaning that a stolen database will not easily compromise your users’ access to other systems. Further, all cryptographic functions directly use common, industry-standard libraries. (A rough sketch of this approach follows this list.)
  • All internal data requests are parameterized, removing the possibility of SQL injection attacks.
  • All direct object references go through a single, authenticated API, and ACLs on each object are validated as part of each request. All parts of the web API, including security and function-level access control, are validated by an automated and comprehensive test suite.
  • Off-cloud backup data is never kept on unencrypted storage.
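
For illustration, here is a rough sketch of that password storage approach using Node’s built-in crypto module (this is not our actual code; the iteration count and field names are hypothetical):

var crypto = require("crypto");

// Hash a password with a random per-user salt using PBKDF2 (32 bytes = a 256-bit key).
function hashPassword(password, callback) {
    var salt = crypto.randomBytes(32).toString("hex");
    crypto.pbkdf2(password, salt, 10000, 32, "sha256", function(err, key) {
        if (err) { return callback(err); }
        callback(null, { salt: salt, iterations: 10000, hash: key.toString("hex") });
    });
}

// Verification re-derives the key from the stored salt and compares.
// (A real implementation should use a constant-time comparison.)
function verifyPassword(password, stored, callback) {
    crypto.pbkdf2(password, stored.salt, stored.iterations, 32, "sha256", function(err, key) {
        if (err) { return callback(err); }
        callback(null, key.toString("hex") === stored.hash);
    });
}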

Data Loss

The second main concern of cloud-hosted systems is the potential loss of business-critical data that is stored on those systems. This risk is minimized as follows:

  • As stated above, cloud infrastructure is provided and maintained by Microsoft. The platform provides geo-redundant storage and virtualization which protects against most infrastructure-based causes of data loss (failure of storage media or other system hardware).
  • Our databases perform streaming replication to warm standby servers to provide high availability in the case of a server or rack fault inside the cloud environment.
  • To protect against application-based data loss or catastrophic cloud infrastructure failure, database transaction logs are archived and shipped to off-cloud encrypted storage every 10 minutes, allowing us to rebuild the database to within 0-10 minutes of the failure or loss point.
  • All form-based data in Cyanic Business Automation Studio is internally historized to allow access to previous versions (in the case of user-based data loss) and also to provide auditing in the event of abuse.
  • Cyanic Business Automation Studio allows users to export their data in the form of PDFs, and we highly recommend that our customers routinely use this feature to guarantee access to their data even in the face of extraordinary circumstances that we are not able to control.

Account or Service Hijacking

Even with a secure infrastructure, many systems are vulnerable to user-centric attacks, where a user’s network traffic is intercepted and hijacked, or a user’s session is tricked into performing actions from an external source. For the past decade, the OWASP (Open Web Application Security Project) ‘Top 10’ vulnerability list has included most of the common attack vectors below, which continue to plague internet-based applications; here is our strategy for controlling these risks:

  • Cross-Site Scripting (XSS): All output is fully encoded through an MVVM framework, preventing markup and data from being treated as executable code on the client’s web browser.
  • Cross-Site Request Forgery (CSRF): All pages served by Cyanic Business Automation Studio include anti-CSRF tokens, which are required on every web API request that modifies data or otherwise changes the state of the system (a simplified sketch follows this list).
  • Insufficient Transport Layer Protection: Cyanic Business Automation Studio forces all traffic between client and server to use HTTPS/SSL, preventing data and session information from being intercepted on public networks.
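
To illustrate the anti-CSRF token idea, here is a simplified sketch (not our actual middleware; it assumes an Express app with session support, and the header and session field names are made up):

var crypto = require("crypto");

// Issue a per-session token that pages embed in their web API requests.
function issueCsrfToken(req) {
    if (!req.session.csrfToken) {
        req.session.csrfToken = crypto.randomBytes(32).toString("hex");
    }
    return req.session.csrfToken;
}

// Reject any state-changing request that doesn't echo the session's token back.
function requireCsrfToken(req, res, next) {
    if (req.method === "GET" || req.method === "HEAD") {
        return next();                                  // reads don't change state
    }
    var token = req.headers["x-csrf-token"];
    if (!token || token !== req.session.csrfToken) {
        res.writeHead(403);
        return res.end();
    }
    next();
}

// e.g. app.use(requireCsrfToken);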

Denial/Loss of Service

Denial of service is an attack on a system’s accessibility rather than its data. The direct risk to the end user is that the service or its information will not be available when it is needed. This risk is minimized as follows:

  • Microsoft Azure has a number of basic protection strategies available to hosted systems, primarily throttling network requests when it detects network flooding.
  • Access to Cyanic Business Automation Studio is by customer subscription only, meaning that computationally expensive operations are not available to the general public.
  • As stated previously, we recommend that our customers routinely use the data export features so that data is available offline. Further, if data entry into electronic forms is critical for business functions (such as hazard assessments which must be completed before a job can be started), it is important that workers always have access to paper forms in case the system is inaccessible for any reason including system or network outage.

Other Hosting Options

We believe that our Microsoft Azure-based hosting strikes a good balance between security and the cost-effectiveness of our solution. If your organization has security requirements that are not adequately addressed by our cloud security strategy, or if it is unacceptable for your data to reside in US data stores (a common concern for our Canadian customers), Cyanic Automation offers two alternative hosting options at an increased cost.

Canadian-Based Single-Tenant Hosting

Cyanic Automation can host a system at a Canadian-based provider which is dedicated solely to your organization, and which does not share data with any other organizations.

On-Premises Hosting

Cyanic Automation can deliver server equipment to be installed and operated inside your organization’s network. We will assist with installation, software updates and back-up strategy; however there will be an increased burden of maintenance on your IT organization.

Conclusion

Cyanic Automation is absolutely committed to protecting your data, and security is our single largest development focus. If you have any questions, please either contact us or leave a comment below.

Software Complexity: A look at .NET and Node.js

Software complexity is killing software development as we know it. I’ve held this opinion for at least the last decade, and I thought it would make a good first post for the Cyanic Automation developer blog.

If we look at how most major development environments evolve over time, it is clear that each new version brings increased complexity due to new language features, new frameworks and subsystems and new tooling. The march is endless and systems such as Microsoft .NET are continuing to grow new arms and legs every day. It’s always good to keep moving forward, but new features usually follow the same pattern: nice, user-friendly developer tools that promise simplicity, yet eventually suck you into a world of limitations, hacks, and a huge burden of knowledge as soon as you try to do something useful with them.

I hate to pick on Microsoft, but let’s have a look at what they’ve given us in recent memory (some of this may not be entirely fair, but I’m trying to prove a point here so please bear with me):

  • Entity Framework: Yet another ORM framework.  Many developers, encouraged by early successes of drag & drop database integration, continue down this path and eventually get mired in compromised database structure, leaky abstractions in their data layer, bad query performance and RDBMS lock-in.  I’m not saying EF (or ORM in general for that matter) is bad, but these systems are huge and require a large time commitment to master.  I think we can at least all agree that the effort that has gone into engineering and using these systems, just so that we can persist and query data with a bit less code, is pretty insane.  A great article on the topic that has stuck with me over the years: The Vietnam of Computer Science.
  • WCF Data Services: Gives you the ability to create an OData service from any EDM model with just a few mouse clicks!  Unfortunately, exposing your raw relational data probably looks like crap in OData, and if you want to customize the rendering or (god forbid) try to use a non-relational source you’re in for a world of hurt, most likely ending in you writing a custom provider, an IQueryable implementation and bottles of absinthe to dull the pain.  I once spent months of my prime programming years making a custom Data Services provider for a semantic data store, and memories of the experience still fill me with anxiety when my mind wanders to the topic.  The documentation for WCF Data Services is poor to non-existent, the only useful resources being MSDN developer posts that usually have to explore into internal APIs and end with a “happy hacking” directive.
  • ASP.NET Web Forms: Okay, this is going back a ways, but lots of people still use Web Forms, and it’s horrible enough to make the list.  Again, lured by the promise of building dynamic web pages by drag & drop, all Web Forms projects ultimately decay into something sick and twisted and terrible to behold.  One of the worst experiences a developer can have is to come into a Web Forms project cold, expecting to maintain it.  None of the data flow makes sense, especially with 3rd party controls that magically communicate back to the server, with crazy eventing models and state management.  MVC usually fares much better in this regard, but is still a large platform to learn and master.
  • Windows WF: Now here’s a way to implement processes through drag & drop (instead of, umm, writing code I suppose).  Unfortunately there’s a large burden of knowledge before you can do anything remotely useful with it, and you end up writing a whole lot of code to save yourself from writing small amounts of code.  Maybe I just don’t “get it”, but I’m not sure why developers are constantly discouraged from writing code – it’s what we’ve spent our lives doing, and we’re better at it than anything else.  Windows WF actually makes a lot of sense in SharePoint and makes that platform vastly more capable; unfortunately, WF has been getting less useful in SharePoint over time, presumably as Microsoft takes away power features so it better fits into Microsoft’s low-trust hosting environments.
  • SharePoint: Here we are.  The mother of all supremely complicated, mis-engineered and confounding software packages.  This usually comes as a surprise to people; after all, SharePoint is becoming ubiquitous, and it’s really quite nice for the end user.  But, you really have to stand back in awe of how quickly it can grind down and crush the hopes of even the most hardboiled of developers.  As soon as you get past the user-friendly façade and try to write code to do anything, even the simplest customization becomes a living nightmare.  Nothing ever works the way you expect it, the APIs are inconsistent and bizarre in places, the documentation is either missing or wrong, none of the error messages or logs are helpful and some features are just plain unreliable (I’m looking at you, workflow!).  There’s so much time spent trying to get even basic things working, you start to wonder if you’re losing your edge – this is an enterprise-level Microsoft product, it must be me, right?  But then you realize, after all the research and Stack Overflow articles from equally suicidal developers around the world, that the whole thing is just completely rotten on the inside, going all the way back to its Vermeer/FrontPage origins.  At some point I’ll have to gather my thoughts and write a whole article about SharePoint, mostly to get it off my chest, but also to serve as a warning to others.  I’ve been developing with SharePoint since the WSS 3.0 days – I’ve eventually learned how to make SharePoint projects succeed, partly through a hard-earned repertoire of magic tricks and secret sauce, but mostly from knowledge about what parts of SharePoint are trouble and staying the hell away from those.

 

Why should you care about any of this?  For a few reasons:

  • The energy level of a software developer can be an amazing, but fickle thing.  A problem that is tough, but adds value when solved will excite and challenge a developer.  He or she will throw their whole being into solving a problem like that, and the end result is productivity and satisfaction.  On the other hand, when working on a problem that is simple but made difficult by the technology being used, a developer’s energy is ground down to nothing, producing mostly waste and frustration.  Morale on the development team is extremely important, and in my experience, technology choices play a role in keeping people happy.
  • Complexity tends to beget more complexity, and can affect entire cultures.  You can see this all through Microsoft, and even worse in the Java community, where people build frameworks for building frameworks, seemingly to feed some kind of masochistic deity that gains strength from the suffering of developers.
  • In today’s big-corporate environment, there just isn’t time to learn and master these large subsystems.  In truth, very little of a developer’s time is spent making things anymore as the burden of processes increases.  Here I’m talking about requirements traceability, quality metrics, stage gate completion checklists, the list goes on and on.  What happens when it’s time to write some code?  Frequently people will stick with ancient or inappropriate technology that they are comfortable with.  Or, they charge ahead with new technology they don’t know much about, lured into a false sense of security by the few quick tests that worked out well.
  • People tend to become emotionally invested or attached to things that took a lot of time and effort to master, and they become unwilling to make changes even when it makes sense.  There are few industries where a person’s working knowledge is obsoleted so frequently.  It happens all the time in the software development world, especially with companies like Microsoft who tend to lose interest and abandon entire systems (Silverlight, LINQ to SQL, etc).

 

Simplicity and Node.js:

Cyanic Business Automation Studio (the platform that powers our Cyanic HSE product) was initially born out of an unwillingness to accept a few key limitations in SharePoint that made it intractable for our needs.  When we started development, the pain of my last big-corporate project was still fresh.  I was working on an alarm viewer application for mobile heavy equipment, a project that was a perfect intersection of difficult legacy frameworks and technologies from years past.  While it all turned out well in the end, it took way longer than it should have, and was overall a mind-shattering, frustrating experience for something so conceptually simple.  I vowed that the next time we had an opportunity to start a project from scratch, I’d throw away all of my least favorite technologies and start over with something that fit our principles.  After I used Node.js for a few days, I knew it would be a perfect fit for Cyanic Business Automation Studio.

What is Node.js?  Essentially it is Google’s V8 Javascript engine wrapped up with an asynchronous eventing library, and a helpful and active ecosystem with libraries to do anything you could possibly imagine.  I don’t know very much about its origins or the motivations of the creators, but to me it feels like a wholesale rejection of the idea that software needs to be difficult and complicated.  It’s a system stripped bare, a return to first principles:

  • Here’s a system that you can pretty much learn in a single day, and it uses a language you probably already use on a daily basis.  It removes all need to worry about thread concurrency, and yet still achieves huge performance for server applications that aren’t compute-bound.  Its packaging system npm is excellent, and deployment is usually very straightforward.  Oh, and it’s truly cross-platform, and totally free!
  • You get to use the same core language on both client and server.  In addition to being easier on the brain when switching between client and server, you get to share code between them as well.
  • If complexity begets complexity, then the inverse also holds.  Node.js’ first-class web frameworks (namely, Express and Restify) use a single pattern for all request middleware and routing functions, which gives you huge control over every part of the web application (a quick sketch of the pattern follows this list).  ASP.NET MVC, after all of its iterations and OWIN modules, still doesn’t come close.  It’s shocking to think that after a few minutes of looking through examples you can be fully productive – probably 90% of my current Express/Jade knowledge was gained in the first few hours of using it.  Further to this point, most of the community-contributed libraries are equally simple: a quick look through the standard project ‘readme’ and a few examples is enough to get you going without trouble.
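
To show what I mean by a single pattern, here is a trivial sketch (the route and messages are made up): every middleware function and route handler in Express has the same basic shape, a function of the request and response, with middleware also receiving a next callback.

var express = require("express");
var app = express();

// middleware: same shape as a route handler, plus 'next'
app.use(function(req, res, next) {
    console.log(req.method + " " + req.url);
    next();                              // hand off to the next middleware or route
});

// route handler
app.get("/hello", function(req, res) {
    res.send("hello world");
});

app.listen(3000);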

 

How to get started with Node.js:

  • The first step is to buy and carefully read JavaScript: The Good Parts by Douglas Crockford.  This is the Kernighan & Ritchie book for Javascript, and tells you everything you need to know with a minimum of fluff.  If you already know Javascript (like I thought I did), this book does a great job at purging all of your harmful Javascript habits and preconceptions, and getting you to understand its true nature.  Javascript gets a lot of hate from certain developers (such as me, from about a year ago) but I now see it as a beautiful & massively expressive language that is a pleasure to use.
  • Download Node.js and your preferred development environment.  I personally use Node.js Tools for Visual Studio as it gives a similar experience to .NET development, including Intellisense and debugging.  I’m not sure if I’d recommend this at the moment, as the plugin is still pre-release, and still seems to have a minor case of serious brain damage.  I do have high hopes for this project however, and as soon as the bugs are worked out it’s going to be awesome.  Some other options to consider in the meantime:
    • Nodeclipse: I’ve tried it and it seems ok, though in general I much prefer Visual Studio to Eclipse.
    • A good Javascript editor such as Brackets or Atom, and debugging in Chrome through node-inspector.
  • Get familiar with Express, Jade and these high-quality libraries:
    • async: The only module that I consider to be mandatory.  If there’s one problem people complain about with Node.js, it’s that nearly all calls are asynchronous (a good thing), but nested asynchronous callbacks get weird and hairy pretty fast.  Async solves this problem, in its entirety, and has a very positive effect on the readability & maintainability of your code.  You must use it.  (A tiny example follows this list.)
    • Moment.js and Moment Timezone: Fills in all of the missing gaps in Javascript’s time handling functions.
    • Winston: A good logging library.
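
As a tiny, self-contained illustration of what async buys you (the file names here are made up): the steps read top to bottom instead of marching off the right edge of the screen, and every error funnels into a single handler.

var async = require("async");
var fs = require("fs");

// Read a file, transform it, and write the result, without nesting callbacks.
async.waterfall([
    function(callback) {
        fs.readFile("config.json", "utf8", callback);       // callback(err, text)
    },
    function(text, callback) {
        var config;
        try { config = JSON.parse(text); } catch (e) { return callback(e); }
        config.updated = new Date().toISOString();
        callback(null, config);
    },
    function(config, callback) {
        fs.writeFile("config.out.json", JSON.stringify(config, null, 2), callback);
    }
], function(err) {
    if (err) { console.error("failed:", err); }
});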

 

While we’re on the topic, a few notes about the other pieces that round out our current technology stack:

  • PostgreSQL: No journey to simplicity is complete without a migration from SQL Server to PostgreSQL.  The internet has already said enough about how excellent PostgreSQL really is, but I’d also like to add that it’s really simple to install, configure and work with.  Installation is usually a one-line command on most Linux distros.  Configuration typically means adjusting one or two configuration files; even getting write-ahead log archiving to work is really easy.  Everything just works the way you expect, and it also includes the awesome json column type.
  • Knockout.js: A client-side MVVM framework.  There are a lot of MVVM frameworks out there, but Knockout.js is really easy to get started with and is as powerful as you need it to be.  Perhaps most important is that it’s unopinionated software and integrates easily into the way I like to do things.
  • jQuery Mobile: I’m less enthusiastic about this one, but it more or less works and has not entirely unacceptable performance.  At some point we’ll probably be looking to switch to another web framework.  (Have you had good experiences with another framework?  If so, please leave a comment below; we’re interested in your opinions.)

 

Is Node.js all flowers, rainbows and unicorns?  Sadly, not entirely:

  • It’s still Javascript, and Javascript isn’t perfect.  It’s not a panacea, and no doubt other systems will emerge that are better in every way, but it’s a great choice for certain types of projects for today.
  • I’m not sure how well things will hold up for a large codebase and a large development team.  I suspect each project will need a strong-willed and loud-spoken enforcer to review code and make sure people are following conventions and prevent things from getting weird.
  • You do lose compile-time error checking, and some Intellisense fidelity.  That’s quite unfortunate, but I haven’t found it to be that much of a problem.  Plain syntax errors (missing bracket, broken object form, etc) are caught at start-up.  Invalid function calls and similar problems will sneak through, but are easy to fix and generally not the kind of errors I worry about.  In any case, it’s more important than ever to have reasonable automated test coverage.

 

In closing, I’d like to say that we’ve been very happy with Node.js in Cyanic Business Automation Studio.  It gives us the simplicity and power we need, offers great performance and stability, and is very comfortable with CBAS’s dynamic nature.  You should check it out, and see if it can make your job a bit more fun.

 

Happy hacking.