Serverless Manchester

In my own time I’ve been setting up a meetup group around Serverless (the framework).

The group has only just been created and there’s no date yet for the first meetup, but it’s definitely something I’m working on. There are no sponsors yet either; everything will be ironed out in due time, I need some content first.

If you’re interested in this, or just want to keep in touch and hear when the first meetup happens, feel free to sign up and to forward this email to whoever you like. If you know somebody who’d like to speak, or want to give a talk yourself, I’d very much like to talk to you!

Meetup group: https://www.meetup.com/Serverless-Manchester/

Framework in question: https://serverless.com/

Some good theory about Serverless architecture: https://martinfowler.com/articles/serverless.html

And of course the mandatory Wikipedia article as a starting point:

https://en.wikipedia.org/wiki/Serverless_computing

How to get a free website hosted (for a year)

Clickbait? Maybe, you tell me. In today’s world, whatever you do, there is no excuse not to have a website: if you have a business, if you’re writing a blog, if you want your CV out there, there are plenty of reasons to show what you have to offer to the world. I’m going to assume you know how to write some HTML, CSS and JS; if you don’t, head to W3Schools and get some nice free education! Because building an actual website is outside the scope of this post, let’s just grab the boilerplate template from here. In all fairness, it’s a very good starting point for a blank website anyway!

Enter AWS

So everybody in the dev world knows AWS is Amazon Web Services, and S3 is one of the many (many, many) services they offer. S3 is mostly used for storing data; however, so many people now use it to run static websites that AWS explicitly caters for that use case. So first, head over to Amazon and create your AWS account.

Create a bucket

Once you’ve signed in to the console, go to Services > S3 and click Create Bucket. Select a region that suits you and give it a name. Leave everything at the defaults (we’ll configure it in a bit) and create the bucket.

Once you have your bucket, click on the bucket row and the properties panel will appear; take note of the ARN (Amazon Resource Name) of your bucket. You’ll need it further down!
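For reference, S3 bucket ARNs always follow the same simple format, so you can also work it out from the bucket name alone:

arn:aws:s3:::your-bucket-name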

Create a User for access

Now we need a user that can connect to AWS with permission to send data to our bucket, and IAM will help us do this. Go to Services > IAM. In the menu on the left go to Users > Add User. Make sure your user has Programmatic access, as we’re going to use it purely for API access. Don’t worry about permissions yet, we’ll do that in a second. Go to the last screen and take note of your Access Key and your Secret Access Key; if you lose the secret you can’t view it again and you’d have to create another pair, so keep them safe!

Create a policy for accessing your bucket

So now we have a user, let’s create an access policy for our bucket. Policies in AWS are basically a way to describe access rules for an object or a type of object, in this case one bucket versus all the S3 buckets we own. So we stay in IAM and go to Policies > Create Policy. Let’s stick to the visual editor, as it’s much easier that way (a JSON equivalent is sketched after the list below). Your policy will need:

  • Service. S3
  • Actions. These are the permissions you want to grant with this policy. Think of it as what the holder of the policy can do on a bucket; it’s useful to aim for giving the policy exactly what it needs and nothing else.
    • List. I usually go for ListBucket, though technically you don’t need it.
    • Read. GetObject; all you need is to be able to read an object to know if it’s changed.
    • Write. PutObject; this one allows you to write files to the bucket.
    • Permissions management. PutObjectAcl; we need this to be able to make files public every time we send them to the bucket.
  • Resources. Here you add your bucket ARN so the policy applies to just this bucket. This is a very good way to lock down your security and prevent anyone else from accessing your data; needless to say, in today’s world this is mandatory! When you add your bucket, tick Any for the objects so the policy applies to every object inside the bucket, otherwise your permissions will be too restrictive.
  • Request conditions. This is the last part of the security story. For the time being I will leave it alone, but one of the things you can do here is whitelist IPs and make sure your website is only available to a given set of addresses.
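If you prefer the JSON tab of the policy editor, the end result should look roughly like this (a sketch; the bucket name is a placeholder and the actions mirror the choices above, so trim or extend them to match yours):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:PutObjectAcl"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}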

Create a Group to use the policy

Click on Review Policy and then save your changes. Voila! You have an access policy for your bucket. Next up, we’re creating a group: go to Groups in IAM and click Create New Group. Give it a name, and remember descriptive is good for later (I went for s3-access-to-my-site). On the Attach Policy screen, find your access policy and select it, then click Next Step and save your group. Now click on your newly created group and, on the Users tab, click Add Users to Group and add the user you created earlier to this group.

In Review

You have now created a bucket where you’re going to store your website files, a group that has permission to upload files to that bucket, and a user that belongs to that group. You’re all set! You just need a website now!

Get the AWS CLI toolkit

Now you need the AWS CLI toolkit, which lets you upload your files to S3 with a quick command line. Go to the CLI homepage and get yours: if you’re on a Mac or Linux you can install it easily from the terminal, and if you’re on Windows there’s a nice MSI installer for you. Either way it’s very simple!

Once you have the toolkit installed, it’s time to go back to the notes you took when creating your user for the Access Key and the Secret Access Key. Oh, you lost them? No problem, just create a new pair of keys.

Now open your terminal and tell the AWS CLI who you are with the following command:

aws configure

This will ask you for both keys, your default region name (I am using eu-west-1, which is Ireland; if you need to look up where your region is, go here) and your default output format (json, text or table, whatever takes your fancy).

Now that the toolkit is configured, navigate to your website folder and synchronise it with S3:

aws s3 sync . s3://your-bucket-name --acl public-read 

You should see the files being uploaded, and they’ll be in your S3 bucket now. We’re nearly there! Some final touches: we now want the bucket to serve the files as a static website, so go to the bucket in the AWS console, open Properties, select the static website hosting section, set your index document and your error document, and voila! We’re done.
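If you’d rather stay in the terminal, the CLI can flip the same switch; for example, assuming index.html and error.html as your index and error documents:

aws s3 website s3://your-bucket-name --index-document index.html --error-document error.html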

So where is your website? The format of your website URL is now this:

<bucket-name>.s3-website-<AWS-region>.amazonaws.com

So it would be something like http://this-is-not-a-real-name.s3-website-eu-west-1.amazonaws.com

Really easy, right? Well, have a go and let me know how it works for you!

Options and delivery

This is a bit of a change from my usual posts, as my job has evolved over time from just day-to-day coding into worrying about delivery as well. Something I’ve learnt, and which is very easy to forget when estimating or explaining how a team is going to tackle a problem or deliver a feature, is the value of providing options.

Most software methodologies these days are aimed at one thing above everything else: delivering quickly. Some say you need to fail fast to know when you’re wrong, and the same applies when you’re doing well, but everyone agrees that whatever you do, you need to deliver as fast as you can. However, you also need to deliver at a sustainable rate. Enter options. When my team is looking at a new piece of work, one of the fundamental questions we always like to ask is: have we given the stakeholders all the options they could have? Keep in mind that some of these options might not even be obvious, but there could be a huge potential win in them!

Imagine you work for an online retailer and you need to incorporate a new delivery courier into your checkout process. The obvious thing is to implement the backend API that speaks to the new courier, wrap it all behind some form of facade (chances are you will have to add more couriers later) and expose an endpoint for your application. So far, you don’t seem to have options. But maybe you do: do you need to implement the facade against every courier API endpoint? Maybe you could go for the smallest piece of integration (say, standard deliveries only?). This is where options come in handy. As developers, we sometimes forget that when we deliver a feature it usually responds to a business request and we might not be holding all the cards, so options on how to build it are always necessary and very important.

Right, we have a backend that may or may not implement every possible way of doing a delivery; that’s the backend side done, so what’s next? Well, for the frontend, do you need to build a fully fledged UI showing all the options, or could you go as simple as showing a message when customers pick a product that uses this courier?

There is no recipe here and it all depends on which questions you ask yourself when you are planning, but it is an essential part of delivery to ask the most fundamental question of all: is this the absolute minimum that would give you some value? No? Slice it as an option and trim the feature a bit more. This is a fairly easy way to come up with some options every time we are planning what answer to give back to stakeholders, and it’s a very good exercise for the whole team, as you learn more and more about your product.

An open letter to recruiters

Dear recruiters,

Before I say anything I have to admit I have met some very good recruiters in my professional life, people as passionate about their jobs as devs are about theirs. If you are reading this you know who you are, mostly because you enjoy your work and you feel like you’re helping both the employee and the employer bond in a relationship that works for both of them. I’m well aware I’m not a recruiter myself, nor do I intend to tell any of you how to do your work, but considering the state of things today I thought it was worth voicing my ideas in the hope that somebody will listen and find them useful; in the worst-case scenario, they’ll serve as catharsis for all the complaints about the recruitment industry that have been in my head for a while.
Now, let’s take a step back for a second. There is nothing more exciting, when you’re looking for a job, than getting a reply back from that ad at the company you liked. You do your homework to prepare for the interview: what they do, how they do it, although if it’s that company you really want, you’ll know all of that already. Now let’s look at it from the other side, shall we? You are the recruiter looking to hire the best talent on the market, and if you are recruiting in IT it’s a tough game out there; everybody wants IT people these days: infrastructure, DevOps, .NET devs (yep, that’s me!), and the list goes on and on. However, much like sending my CV to that company I really wanted to work for, or visiting the in-laws for the first time, you usually get one shot, and here is where my tone might begin to sound like a rant. If I get an email offering me a job in a field I’ve never worked in or expressed any interest in over a several-years-long career, chances are I will think you never read my CV or cared about how good (or awful) I could be in that role. That speaks poorly of you, because you are not trying to give me a role that works for my career or give your client a candidate that matches what they’re looking for; you just want to take a shot and see if you get the commission.
As a general rule of thumb, most developers in active technology hubs (such as Manchester) tend to receive upwards of 10 emails from recruiters a week. Now, I understand that there is a lot of competition between recruiters out there and, to be fair, if you are reading this you probably do these things anyway, but here are the things I’d suggest you consider before you email a candidate.

Read my profile

Obvious, right? It takes about 5 minutes to read a normal-sized CV, or at least skim through it, and it can give you a revealing idea about the candidate. Take me: I did PHP programming about 5 years ago (2011) but I’ve done .NET for a lot longer than that, and I don’t mention PHP anywhere else on my CV. That should be a hint that I’m not looking for PHP jobs (I’m putting this in bold because I’m dead sure I’ll get a PHP job in my inbox from somebody who “read” the article).

Don’t send several jobs in one mail

Consider a guy emailing you to say he wants to apply for a position as a junior developer, mid-level developer, senior developer or chief architect, depending on what you have available. That raises a lot of flags, doesn’t it? Well, it’s the same when it gets sent the other way. I particularly hate the database-dump emails that start “I have the following roles available…”; it shows a lazy approach. Please don’t do that, it makes you look awful.

If I reply, please do reply back

This is a very old pet hate. When I was looking for a job about a year ago I got several people interested; however, once I mentioned I was on a Tier 2 visa, it turned out a lot of companies do not sponsor visas (sponsorship is getting more common now, but this still happens a lot). If I took the time to write to you and explain my situation, take 2 minutes to write back: “Sorry, but our client does not sponsor visas; I’ll let you know if that changes.” Even if we all know that’s probably a lie, it makes us feel like we had closure on the email.

Don’t just find candidates, find The Candidate

I know this sounds like a textbook cliché, but I’d be delighted to see the stats (feel free to share and prove me wrong) on how many answers you get from those generic emails sent to 20 or 30 candidates without reading their CVs. I trust that one day we’ll have systems smart enough to pick the right candidates for a job, but until then, do a bit of manual filtering and only send the role to the real prospects.

Make it personal

This probably ties into the point above, but I’m a lot more inclined to reply to an email that is clearly directed at me, and you can do that by saying you read something on my blog, or you saw my Stack Overflow profile, or something as simple as asking how work is going at my current place. If I see that you took a tiny bit of time to get to know me, I will definitely reply and tell you what I think of the role even if I’m not looking; it’s only polite after all.

Finally

These may all sound a bit vague, but they are definitely my pet hates. It’s very difficult not to sound like a ranting old man right now (maybe because that’s what I’m doing), but I think these are the main issues giving recruiters the bad name they have. I’ve met some really great ones, so why settle for being the spam sender when you can send a quality job to a quality candidate? Think about it.

SQL Server Index fragmentation

Now this was a new one for me…

If you have had to deal with big systems, you know that when a table peaks at a couple of million rows and counting, things tend to get a tad slow unless you’re planning some heavy optimization. However, there’s always room to push things to new extremes once you start handling that much data.

One of the first logical things you check on a database is that you have the correct indexes on each table, and that if you’re searching over one field, that field is indexed (where it can be) so searches get a bit faster. An index is a good thing to work with because, basically, it stores an index (duh!) of how to access the records (rows) in the table, where the ordering of that index is precisely the field you chose. This article is a very good and quick way to learn how indexes work; if you’re a developer and don’t know, please take 5 minutes of your time to read it. Don’t worry, I’ll wait for you…

Back? Very good, give yourself a pat on the back and let’s move on. As you may now know, indexes are stored in trees, and as you add or delete (fundamentally delete) rows you create gaps in the tree that would be far too costly to fix as part of those insert or delete operations; those gaps are called fragmentation. SQL Server, being the fan-fricking-tastic tool that it is (no, Microsoft does not pay me), has a way for you to query a database for which of its indexes are fragmented and how fragmented they are.

The key is the sys.dm_db_index_physical_stats function:

sys.dm_db_index_physical_stats (DB_ID(N'MY_DB'), OBJECT_ID(N'MY_TABLE'), 1, NULL, NULL)

However, that provides information for just one index, so with a bit of SQL Magic, we get this far:

SELECT DISTINCT frag.* FROM
	( SELECT 
			TableName = t.name, IndexName = ind.name, IndexId = ind.index_id, ColumnId = ic.index_column_id, ColumnName = col.name,
			(
				SELECT avg_fragmentation_in_percent
				FROM sys.dm_db_index_physical_stats (DB_ID(NULL), OBJECT_ID(t.Name), ind.index_id, NULL, NULL) AS a
				JOIN sys.indexes AS b 
				ON a.object_id = b.object_id AND a.index_id = b.index_id
				WHERE avg_fragmentation_in_percent >= 0.1
			) AS Fragmentation
	FROM 
			sys.indexes ind 
	INNER JOIN sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
	INNER JOIN sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
	INNER JOIN sys.tables t ON ind.object_id = t.object_id 
	WHERE 
			ind.is_primary_key = 0 
			AND ind.is_unique = 0 
			AND ind.is_unique_constraint = 0 
			AND t.is_ms_shipped = 0
	) frag
ORDER BY frag.Fragmentation DESC

Whoa! That’s a lot bigger, isn’t it? Well, yes, because this one goes through all the indexes in our database, calls that function for each of them and returns everything in a nice little table, where you’ll know whether any indexes need your attention.

Defragmenting the bad ones

OK, so you now know which index(es) are misbehaving in your database and we need to defragment them, so we’re going to use a simple ALTER statement and be done with it. As a quick sample, if we had an index called MY_INDEX_I on a table called MY_TABLE, the ALTER statement would be:

ALTER INDEX MY_INDEX_I ON MY_TABLE REBUILD

Now, this is a bit extreme, because it drops the stored index and rebuilds the whole thing, which can potentially take quite some time depending on your data and the number of rows you have. There is another way to defragment indexes, but because it didn’t work with SQL Azure and this particular investigation was done for a server on that version, I didn’t look into it; the call is not too far off, though (see the sketch below). On top of that, a full rebuild is arguably overkill for a table with only around 10% fragmentation, but it will always work, so it’s a very good safe bet.
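For reference, the gentler alternative uses the same statement with REORGANIZE, which defragments the index in place instead of rebuilding it from scratch (same hypothetical names as before):

ALTER INDEX MY_INDEX_I ON MY_TABLE REORGANIZE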

Now we have a way to do it manually; however, what happens on a production database with maybe 90 or 100 indexes that gets heavily fragmented every few weeks? You just can’t do it manually, in case you were wondering. For that we’re going to use a very handy stored procedure called sp_executesql, which lets us invoke dynamically created SQL. All we need is a cursor to go through the indexes and then generate the SQL to defragment each one, and voila. It goes a bit like this:


DECLARE @MySQL NVARCHAR(4000)
DECLARE @Now DATETIME
DECLARE @Message VARCHAR(255)
DECLARE @TableName VARCHAR(255)
DECLARE @IndexName   NVARCHAR(255)
DECLARE @IndexId   INTEGER
DECLARE @ColumnId   INTEGER
DECLARE @ColumnName   NVARCHAR(255)
DECLARE @Fragmentation FLOAT

DECLARE myCursor CURSOR FOR SELECT DISTINCT frag.* FROM
	( SELECT 
			TableName = t.name, IndexName = ind.name, IndexId = ind.index_id, ColumnId = ic.index_column_id, ColumnName = col.name,
			(
				SELECT avg_fragmentation_in_percent
				FROM sys.dm_db_index_physical_stats (DB_ID(NULL), OBJECT_ID(t.Name), ind.index_id, NULL, NULL) AS a
				JOIN sys.indexes AS b 
				ON a.object_id = b.object_id AND a.index_id = b.index_id
				WHERE avg_fragmentation_in_percent >= 0.1
			) AS Fragmentation
	FROM 
			sys.indexes ind 
	INNER JOIN sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
	INNER JOIN sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
	INNER JOIN sys.tables t ON ind.object_id = t.object_id 
	WHERE 
			ind.is_primary_key = 0 
			AND ind.is_unique = 0 
			AND ind.is_unique_constraint = 0 
			AND t.is_ms_shipped = 0
	) frag
ORDER BY frag.Fragmentation DESC
OPEN myCursor

FETCH NEXT FROM myCursor INTO @TableName, @IndexName, @IndexId, @ColumnId, @ColumnName, @Fragmentation

WHILE @@FETCH_STATUS = 0
BEGIN
	SET @Now = GETDATE()
	SET @MySQL = 'ALTER INDEX [' + @IndexName + '] ON [' + @TableName + '] REBUILD'

	SET @Message = 'Reindex: ' + @IndexName + ' ON ' + @TableName + ': ' + CONVERT(NVARCHAR(28), @Now, 21) + ' (' + @MySQL + ')'
	PRINT @IndexName + ' ' + @TableName + ' (' + CONVERT(NVARCHAR(28), @Fragmentation) + ')'

	EXEC sp_executesql @MySQL

	FETCH NEXT FROM myCursor INTO @TableName, @IndexName, @IndexId, @ColumnId, @ColumnName, @Fragmentation
END

CLOSE myCursor
DEALLOCATE myCursor

I know, right? But this works like a charm, and because it’s fully automated you can call it from any of your systems as a cron job and never worry about fragmented indexes again (a sketch of what that could look like is below). Did you try it? Well, give me a shout and let me know how you got on.
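For example, assuming you save the script above to a file and have sqlcmd available (the server, credentials and path below are placeholders), a nightly crontab entry could be as simple as:

0 2 * * * sqlcmd -S my-server.database.windows.net -d MY_DB -U maintenance_user -P "$SQLCMD_PASSWORD" -i /opt/scripts/defragment-indexes.sql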

DTOs and why you should be using them

If you’ve worked on any modern (decent-sized) application, you know that the de facto standard is a layered design, where operations are grouped into layers corresponding to certain functionality, for example a data access layer that is nothing but an implementation of your repository using NHibernate, Entity Framework, etc. While that is a very good idea for most scenarios, it brings a bit of a problem with it: you need to pass lots of calls between layers, and sometimes that’s not just calling a DLL inside your solution; sometimes it’s calling a service hosted somewhere over the network.

The problem

If your app calls services and receives data from them (obviously?), then you might encounter something like this in your service:
public Person AddPerson(string name, string lastName, string email)
Now, let’s first look at the parameters and why this is probably not a very good definition.
In this method you have three arguments: name, lastName and email. What happens if somebody needs a telephone number? Well, we just add another argument! Dead easy! Yeah, no. Suppose we make it more interesting by saying we have Workers and Customers, both inheriting from Person; we would then have something like this:
public Person AddWorker(string name, string lastName, string email)
public Person AddCustomer(string name, string lastName, string email)
If you need to add that telephone number now and go for that extra parameter, you have to add code in two places, so you need to touch more code, and what happens when we touch more code? Simple: we add more bugs.

The Good

Now, what happens if you have this?
public Worker AddWorker(Worker worker)
public Customer AddCustomer(Customer customer)
DTO stands for Data Transfer Object, and that is precisely what these classes do: we use them to transfer data in our services. For one, the code is much simpler to read now! But there is another benefit: if Worker and Customer inherit from Person, as they should considering they are both a Person, then we can safely add that telephone number to Person without having to change the signature of the service. Yes, the service will now receive an extra piece of data, but we don’t have to change the service signature in the code, just the DTO it receives.
Now, more on the common use of DTOs. As Martin Fowler states, a DTO is:
An object that carries data between processes in order to reduce the number of method calls.
Now, it’s fairly obvious that using DTOs for input arguments is good, but what about output? Well, it’s a similar story with a small twist. Considering that many people today use ORMs to access the database, it’s very likely that you already have Worker, Customer and Person classes, because they are part of your domain model or they are generated by LINQ to SQL (not a huge fan, but many people still use it). So, should you use those entities as the return values of your services? Not a very good idea, and I have some reasons for it.

One very simple reason is that the objects generated by these frameworks are usually not serialization friendly, because they sit on top of proxy classes that are a pain to serialize for anything that outputs JSON or XML. Another potential problem is when your entity doesn’t quite fit the response you want to give. What happens if your service has something like this?

public Salary CalculateWorkerSalary(Worker worker)

You could have a very simple method just returning a double, but let’s think of a more convoluted solution to illustrate the point. Imagine Salary looks like this:

public class Salary
{
     public double FinalSalary {get;}
     public double TaxDeducted {get;}
     public double Overtime {get;}
}

So, this is our class, and Overtime means it’s coupled to a particular worker, because not everybody does the same amount of overtime. Now, what happens if we also need the tax code for that salary? Or the overtime rate used in the calculation? (That is assuming these are not stored on the salary table.) More importantly, what happens if we don’t want whoever is calling the API to see the overtime the Worker is doing? Well, the entity is not fit for purpose and we need a DTO where we can put all of this; simple as that.
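As an illustration (hypothetical names, shaped purely for the example above), such a DTO could look like this:

public class SalaryDTO
{
    // Data the caller is allowed to see
    public double FinalSalary { get; set; }
    public double TaxDeducted { get; set; }

    // Extra data that isn't on the Salary entity
    public string TaxCode { get; set; }
    public double OvertimeRate { get; set; }

    // Overtime itself is deliberately left out so API callers never see it
}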

The Bad

However, DTOs are not all glory. There is a problem with them: they bloat your application, especially if you have a large application with many entities. If that’s the case, it’s up to you to decide when a DTO is worth it and when it’s not; like many things in software design, there is no rule of thumb and it’s very easy to get it wrong. But for most cases where you pass complex data around, you should be using DTOs.

The Ugly

There is another problem with DTOs: you end up with a lot of code like this:

var query = _workerRepository.GetAll();
var workers = query.Select(ConvertWorkerDTO).ToList();
return workers;

Where ConvertWorkerDTO is just a method looking pretty much like this:

public WorkerDTO ConvertWorkerDTO(Worker worker)
{
    return new WorkerDTO() {
        Name = worker.Name,
        LastName = worker.LastName,
        Email = worker.Email
    };
}

Wouldn’t it be cool if you could do this without a mapping method, like so:

var query = _workerRepository.GetAll();
var workers = query.Select(x => BaseDTO.BuildFromEntity<Worker, WorkerDTO>(x))
                   .ToList();
return workers;

Happily, there is a simple way to achieve a result like that, combining two very powerful tools: inheritance and reflection. Just have a BaseDTO class that all of your DTOs inherit from, and give it a method like that one, which performs the conversion by mapping property to property. A fairly simple, yet fully working, version could be this:

public static TDTO BuildFromEntity<TEntity, TDTO>(TEntity entity)
{
    var dto = Activator.CreateInstance<TDTO>();
    var dtoProperties = typeof (TDTO).GetProperties();
    var entityProperties = typeof (TEntity).GetProperties();

    foreach (var property in dtoProperties)
    {
        if (!property.CanWrite)
            continue;

        var entityProp =
            entityProperties.FirstOrDefault(x => x.Name == property.Name && x.PropertyType == property.PropertyType);

        if (entityProp == null)
            continue;

        if (!property.PropertyType.IsAssignableFrom(entityProp.PropertyType))
            continue;

        var propertyValue = entityProp.GetValue(entity, new object[] {});
        property.SetValue(dto, propertyValue, new object[]{});
    }

    return dto;
}
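To tie it together, a DTO then only needs to inherit from the base class and declare the properties it wants copied across; for example (hypothetical classes matching the earlier snippets):

public class WorkerDTO : BaseDTO
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

// Anywhere a mapping is needed:
var dto = BaseDTO.BuildFromEntity<Worker, WorkerDTO>(worker);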

And Finally…

The bottom line is that, like with everything, you can over-engineer your way into adding far too many DTOs to your system, but ignoring them is not a very good solution either; and adding one or two to a project with more than 15 entities, just to feel you’re using them, is about as good as using a single interface to claim you build decoupled systems.

What’s your view on this? Do you agree? Disagree? Share what you think on the comments!

EDIT: As a side note, it’s worth checking this article, which talks a lot about the subject.

Empower your lambdas!

If you’ve used generic repositories, you will have encountered one particular problem: matching items using dynamic property names isn’t easy. However, using generic repositories has always been a must for me, as it saves me writing a lot of boilerplate code for saving, updating and so forth. Not long ago I hit a problem: I was fetching entities from a web service and writing them to the database, and given that these entities had relationships, I couldn’t retrieve the same entity and save it twice.
Whenever my code fetched entities from the service, it had to work out whether an entity had been loaded previously and, instead of saving it twice, just update the last-updated time and any properties that might have changed. To begin with, I had some simple code in a base web service consumer class like this:

var client = ServiceUtils.CreateClient();
var request = ServiceUtils.CreateRequest(requestUrl);
var resp = client.ExecuteAsGet(request, "GET");
var allItems = JsonConvert.DeserializeObject<List<T>>(resp.Content);

This was all very nice and, so far, I had a very generic approach (using DeserializeObject<T>). However, I had to check whether an item had been previously fetched, and an item’s identity could be determined by one or more properties; my internal Id was meaningless in this context for deciding whether an object already existed. So I had to come up with another approach. I created a basic attribute called IdentityProperty: whenever a property defines the identity of an object externally, I annotate it with it, so I ended up with entities like this:

public class Person: Entity
{
    [IdentityProperty]
    public string PassportNumber { get; set; } 
    
    [IdentityProperty] 
    public string SocialSecurityNumber { get; set; }

    public string Name { get; set; }
}
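The attribute itself can be as simple as an empty marker; a minimal sketch, assuming it only ever goes on properties:

[AttributeUsage(AttributeTargets.Property)]
public class IdentityPropertyAttribute : Attribute
{
}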

This marks every property that defines identity in the context of the web services. So far, so good: my entities now know what defines them in the domain, and now I need my generic service consumer to find them in the database so I don’t get duplicates. Considering that all my entities fetched from a web service have a LastCached and a Timeout property, ideally I would have something like this:

foreach (var item in allItems)
{
    var calculatedLambda = CalculateLambdaMatchingEntity(item);
    var match = repository.FindBy(calculatedLambda);

    if (match == null) {
        item.LastCached = DateTime.Now;
        item.Timeout = cacheControl;
    }
    else {
        var timeout = match.LastCached.AddSeconds(match.Timeout);
        if (DateTime.Now > timeout) {
            //Update the entity using reflection
            item.LastCached = DateTime.Now;
        }
    }
}

Well, actually, this is what I have, but the good stuff is in the CalculateLambdaMatchingEntity method. The idea behind it is to build a lambda to pass to FindBy using only the properties that carry the IdentityProperty attribute. So, my method looks like this:

private Expression<Func<T, bool>> CalculateLambdaMatchingEntity<T>(T entityToMatch)
{
 var properties = typeof (T).GetProperties();
 var expressionParameter = Expression.Parameter(typeof (T));
 Expression resultingFilter = null;

 foreach (var propertyInfo in properties) {
  var hasIdentityAttribute = propertyInfo.GetCustomAttributes(typeof (IdentityPropertyAttribute), false).Any();

  if (!hasIdentityAttribute)
   continue;

  // Build "entity.Property == currentValue" for this identity property
  var propertyCall = Expression.Property(expressionParameter, propertyInfo);

  var currentValue = propertyInfo.GetValue(entityToMatch, new object[] {});
  var comparisonExpression = Expression.Constant(currentValue, propertyInfo.PropertyType);

  var component = Expression.Equal(propertyCall, comparisonExpression);

  // AND the comparisons together; the lambda itself is built once, at the end
  resultingFilter = resultingFilter == null ? component : Expression.AndAlso(resultingFilter, component);
 }

 if (resultingFilter == null)
  return null;

 return Expression.Lambda<Func<T, bool>>(resultingFilter, expressionParameter);
}

Fancy code apart, what this does is iterate through the properties of the object and construct a lambda matching the object received as a sample. So, for our sample Person class, if our service retrieves a person with passport “SAMPLE” and social security number “ANOTHER”, the generated lambda would be the equivalent of issuing a query like:

repository.FindBy(person => person.PassportNumber == "SAMPLE" && person.SocialSecurityNumber == "ANOTHER")

Performance you say?

If you’ve read the About section of my blog, you’ll know that I work for a company that cares about performance, so once I did this I knew the next step was benchmarking the process. The fact that it was for a personal project doesn’t really matter; I had to know the performance made it a viable idea. So I ended up doing a set of basic tests benchmarking the total time the update foreach would take, and I came up with these results:

Scenario                Matching data   Ticks     Faster?
Lambda calculation      Yes             5570318   Yes
No lambda calculation   Yes             7870450
Lambda calculation      No              1780102   No
No lambda calculation   No              1660095

These results are actually quite simple to explain. When there is no matching data, the overhead of calculating a lambda makes it lose its edge, because no items match the query. However, when there are matching items, the power of lambdas shows up: the expression tree doesn’t have to be built from an expression each time, the query receives a previously built tree, so it’s faster to execute. So, back to the original title: empower your lambdas!
If you have any other point of view on these ideas, feel free to leave a comment, even if you are going to prove me wrong with it; I’ve always said that nobody knows everything, so I might be very mistaken here. On the other hand, if this helps, then my job here is complete.

Common method for saving and updating on Entity Framework

This problem has been bugging me for some time now. One of the things I miss the most from NHibernate when I’m working with EF is the SaveOrUpdate method. Once you lose it, you realize just how much you loved it in the first place. So, I set out to give my EF repositories one of those. My initial approach was rather simple, and really close to what you can find here or here, so I basically came up with this:

public T SaveOrUpdate(T item)
{
 if (item == null)
  return default(T);

 var entry = _internalDataContext.Entry(item);

 if (entry.State == EntityState.Detached)
  if (item.Id != null)
   TypeDbSet.Attach(item);
  else 
   TypeDbSet.Add(item);
 
 _internalDataContext.SaveChanges();
 return item;
}

This is a neat idea and it works in most cases, with one tiny issue. I was working with an external API and caching the objects received from my calls, and since these objects had their own keys, I was using those keys in my DB. So I had a Customer class, but the Id property was set just before I inserted, and since our method uses the convention that if it has an Id it was already saved, the repo would just attach it to the change tracker and the object was never saved! Boo! Well, no panic: my repo also has a method called GetOne, which receives an Id and returns that object, so I added that into the soup and got this:

public T SaveOrUpdate(T item)
{
 if (item == null)
  return default(T);

 var entry = _internalDataContext.Entry(item);

 if (entry.State == EntityState.Detached)
 {
  if (item.Id != null)
  {
   var exists = GetOne(item.Id) != null;

   if (exists)
    TypeDbSet.Attach(item);
   else
    TypeDbSet.Add(item);
  }
  else 
   TypeDbSet.Add(item);
 }
 
 _internalDataContext.SaveChanges();

 return item;
}

Now, if you think about it, how would you update an object?

  • Check if the object already exists on the DB
  • If it’s there.. update it!
  • If it’s not there.. insert it!

As you can see, the check involves GetOne. Now, if you are thinking that you don’t want an extra DB call, there is always a solution…

public T SaveOrUpdate(T item, bool enforceInsert = false)
{
 if (item == null)
  return default(T);

 var entry = _internalDataContext.Entry(item);

 if (entry.State == EntityState.Detached)
 {
  if (item.Id != null)
  {
   var exists = enforceInsert || GetOne(item.Id) != null;

   if (exists)
    TypeDbSet.Attach(item);
   else
    TypeDbSet.Add(item);
  }
  else 
   TypeDbSet.Add(item);
 }
 
 _internalDataContext.SaveChanges();

 return item;
}

Granted, it’s not fancy, but it gets the job done and doesn’t require many changes. Passing the enforceInsert flag means you are certain that the object you’re saving requires an insert: it will have an Id, but you know it’s not in the database yet. Just what I was doing!
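For example (a hypothetical Customer entity whose key comes from the external API):

// The API gave us the key, so the entity has an Id but has never been saved locally
var customer = new Customer { Id = externalCustomerId, Name = "ACME Ltd" };
customerRepository.SaveOrUpdate(customer, enforceInsert: true);

// Regular case: let the repository check the database for us
customerRepository.SaveOrUpdate(existingCustomer);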

Do you have any other way of doing this? Do you think this is wrong? Feel free to comment and let me know!

Consuming web services and notifying your app about it in Objective-C

Since almost the beginning of my exploits as an iOS developer I’ve been working on apps that consume web services, and one big problem has been notifying different areas of the app that a certain item has been updated. My first genius idea was to create my own home-brew notifications using the observer pattern. It wasn’t all that bad, but a while later I realized I was reinventing the wheel, so I resorted to the one and only NSNotificationCenter.

Enter NSNotificationCenter

According to Apple’s docs for the notification center, this is the definition:

An NSNotificationCenter object (or simply, notification center) provides a mechanism for broadcasting information within a program. An NSNotificationCenter object is essentially a notification dispatch table.

So, this was my observer! How does it work, you ask? Let’s get to it! But first, some context: what I have is a class called ServiceBase, which is the base class (duh!) for all classes consuming services. The interface definition for the class looks a bit like this…

 @interface ServiceBase : NSObject<ASIHTTPRequestDelegate>
  - (void) performWebServiceRequest: (NSString*) serviceUrl;
  - (void) triggerNotificationWithName: (NSString*) notificationName andArgument: (NSObject*) notificationArgument;
  - (NSString*) getServiceBaseUrl;
 @end
 

The class has been simplified, and the actual class has a few other things that depend more on how I work, but you get the point. Given the idea of this post, I’m going to concentrate on the notification side of the class. However, we do need some sort of example going, so let’s take a look at the performWebServiceRequest method.

- (void) performWebServiceRequest: (NSString*) serviceUrl
{
    if (!self.queue) {
        self.queue = [[NSOperationQueue alloc] init];
    }
    
    NSURL *url = [NSURL URLWithString: serviceUrl];
    ASIHTTPRequest *request = [ASIHTTPRequest requestWithURL:url];
    [request addRequestHeader:@"accept" value:@"text/json"];
 
 [request setCompletionBlock: ^{
  //this will keep the self object reference alive until the request is done
  [self requestFinished: request];
 }];
 
    [self.queue addOperation: request];
}
 

Now, we have this simplified method that creates a request, sets requestFinished as the completion block and queues up the request. I said I would focus on the notifications, but there is one thing to consider here:

 [request setCompletionBlock: ^{
  //this will keep the self object reference alive until the request is done
  [self requestFinished: request];
 }];
 

Keep in mind that this block will preserve the reference to self until the request is finished, so it’s not released early by ARC. However, the way I use services in my app, each service works as a singleton (or something quite close to it), so keeping the reference is not a problem because you are not creating a new instance of each service class every time you make a request. This also solves an issue with ASIHTTPRequest losing the reference to its delegate before the request is complete; however, that’s a story for another day. Now, moving on to the end of the request…

- (void)requestFinished:(ASIHTTPRequest *)request
{
    JSONDecoder* decoder = [[JSONDecoder alloc] init];
    NSData * data = [request responseData];
    NSArray* dictionary = [decoder objectWithData: data];

    for (NSDictionary* element in dictionary) {
  [self triggerNotificationWithName: @"ItemLoaded" andArgument: element];
    }
}
 

When the request is finished, it simply converts the data received (notice that this is a simple scenario) and posts a notification that an item has been loaded, using the triggerNotificationWithName:andArgument: method. Now, on to the actual notification method…

- (void) triggerNotificationWithName: (NSString*) notificationName andArgument: (NSObject*) notificationArgument
{
    NSNotificationCenter * notificationCenter = [NSNotificationCenter defaultCenter];
   
 if ( notificationArgument == nil )
 {
  [notificationCenter postNotificationName: notificationName  object: nil];
 }
 else
 {
  NSMutableDictionary * arguments = [[NSMutableDictionary alloc] init];
  [arguments setValue: notificationArgument forKey: @"Value"];
  [notificationCenter postNotificationName: notificationName  object:self userInfo: arguments];
 }
}
 

Now we only need to subscribe to the notification and retrieve the value, which is very simple; take this example inside a UIViewController:

- (void) viewDidLoad
{
 NSNotificationCenter * notificationCenter = [NSNotificationCenter defaultCenter];
 [notificationCenter addObserver: self selector: @selector(itemLoadedNotificationReceived:) name:@"ItemLoaded" object: nil];
}

- (void) itemLoadedNotificationReceived: (NSNotification*) notification
{
 NSDictionary* itemLoaded = [notification.userInfo valueForKey: @"Value"];
    // Do something with the item you just loaded
}
 

In the itemLoadedNotificationReceived method the app will receive a notification every time an item is loaded. This may not be the best example, because when you’re loading several items they normally go into a cache to be loaded by a UITableView afterwards, but the idea should get you going.
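One thing worth remembering (not shown above) is that it’s good practice to remove the observer when the view controller goes away; a typical pattern is something like this:

- (void) dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver: self];
}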

Do you use a different approach? Do you normally use it like this? Well, if you have anything at all to say, feel free to leave it in the comments!