
Monday 17 August 2015

Working with Word documents using Aspose in C#

The current system we are building is a platform for document authoring and collaboration. Users upload Word documents, and our API extracts the content and translates it into a custom document format. Our current code base does this by parsing the Word Open XML content and translating it into the content elements of our document format. The code largely revolves around dealing with the nuances of the Open XML format and conditionally handling its inconsistencies. Metrics on the code that parses the Word document show very high cyclomatic complexity in places.

This is inevitable if we build all of this ourselves, but in the grand scheme of things, working in a start-up, it cost us at least two people's time for a couple of months. Is this worth it, and will our code be as good at parsing Word documents as some of the libraries out there? The answer is an obvious NO. We didn’t need to build this on our own. At approximately $14,000 I could get a Site OEM license of Aspose.Words for .NET and use it.

Overview of Aspose.Words for .NET

A snapshot of the capabilities of Aspose.Words for .NET from their site is shown below.

Having evaluated a few options, it was an easy conclusion to use Aspose. The product has a mature API and can convert documents into other formats quite easily. The picture below from Aspose should explain everything you need to know about the content formats Aspose supports.

Importing Word Content

Extracting content from different document elements is made easy by Aspose’s document tree navigation and composite nodes. One of our developers shared the code snippets shown below on how to extract paragraphs, table content and footers from a Word document.

Extracting from Paragraphs and footer:
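
As a rough illustration of the idea (a minimal sketch, not our developer's actual snippet), walking the document tree for paragraphs and footers might look something like this. The file name "input.docx" and the console output are assumptions made up for the example.

```csharp
using System;
using Aspose.Words;

class ExtractParagraphsAndFooters
{
    static void Main()
    {
        // "input.docx" is a placeholder path, not a file from our system.
        Document doc = new Document("input.docx");

        foreach (Section section in doc.Sections)
        {
            // Deep search of the section body; this also returns
            // paragraphs that sit inside table cells.
            foreach (Paragraph para in section.Body.GetChildNodes(NodeType.Paragraph, true))
            {
                string text = para.ToString(SaveFormat.Text).Trim();
                if (text.Length > 0)
                    Console.WriteLine("Paragraph: " + text);
            }

            // The primary footer is null when the section does not define one.
            HeaderFooter footer = section.HeadersFooters[HeaderFooterType.FooterPrimary];
            if (footer != null)
                Console.WriteLine("Footer: " + footer.ToString(SaveFormat.Text).Trim());
        }
    }
}
```

Only the primary footer is read here; sections can also carry first-page and even-page footers, which would need the same lookup with a different HeaderFooterType value.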

Extracting content from tables:
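
Again as a hedged sketch rather than the original snippet, table content could be read row by row along these lines. The input path and the " | " separator are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Tables;

class ExtractTables
{
    static void Main()
    {
        // "input.docx" is a placeholder path.
        Document doc = new Document("input.docx");

        int tableNumber = 1;
        // Deep search: nested tables are returned as separate entries too.
        foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
        {
            Console.WriteLine("Table " + tableNumber++);
            foreach (Row row in table.Rows)
            {
                List<string> cells = new List<string>();
                foreach (Cell cell in row.Cells)
                {
                    // Export as plain text to avoid cell control characters.
                    cells.Add(cell.ToString(SaveFormat.Text).Trim());
                }
                Console.WriteLine(string.Join(" | ", cells));
            }
        }
    }
}
```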

Producing Word documents from data using templates

A feature we needed was to take our custom document format, extract data and produce Word documents from Word templates. The mail merge feature in Aspose is pretty slick and let us do this without much effort.

Before we started writing any code we created a Word document template (as shown below in the screenshot) to identify what data needs to be injected into the template. The code snippet below is from one of our developers, who worked with the library more extensively than I did. Aspose has a concept of regions to dynamically grow portions of the document, such as tables. Since we persist the output Word document in a file system, we converted the output into a stream object.
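
For readers without the screenshot, here is a minimal sketch of a merge with one region, under stated assumptions: the template path, the field names (Title, Author, Name, Quantity), the region name "Items" and the sample values are all invented for illustration and are not from our actual template. It assumes the template contains «TableStart:Items» and «TableEnd:Items» merge fields around a table row.

```csharp
using System.Data;
using System.IO;
using Aspose.Words;

class MergeTemplate
{
    static void Main()
    {
        // "template.docx" is a placeholder path for the Word template.
        Document doc = new Document("template.docx");

        // Region data: the DataTable name must match the region name
        // used in the template's TableStart/TableEnd fields.
        DataTable items = new DataTable("Items");
        items.Columns.Add("Name");
        items.Columns.Add("Quantity");
        items.Rows.Add("Widget", "3");
        items.Rows.Add("Gadget", "5");

        // The region repeats once per row, growing the table as needed.
        doc.MailMerge.ExecuteWithRegions(items);

        // Simple (non-region) merge fields elsewhere in the template.
        doc.MailMerge.Execute(
            new string[] { "Title", "Author" },
            new object[] { "Quarterly Report", "Jane Doe" });

        // Save into a stream first, then persist it to the file system.
        using (MemoryStream output = new MemoryStream())
        {
            doc.Save(output, SaveFormat.Docx);
            output.Position = 0;
            using (FileStream file = File.Create("output.docx"))
            {
                output.CopyTo(file);
            }
        }
    }
}
```

Going through a stream keeps the storage decision separate from the merge; the same stream could just as well be handed to whatever persistence layer is in use.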

So when we execute the code to perform a merge of the data and the template, the resulting Word document looks like the following screenshot.

Clearly this is a feature to buy and not to build on our own, because no matter how good we are, the cost of building this is going to exceed the cost of a full-blown version of Aspose.Words for .NET. The library helped us avoid a lot of complex code that we would otherwise have written to meet these requirements. We are no experts in Open XML and frankly don’t think we should be writing code to parse Word documents. Aspose.Words for .NET was an easy choice. Aspose has other components which are worth a look.

Friday 3 January 2014

VS2013 Extensions Gallery

I am a bit lazy when it comes to using VS extensions. I don't think I have bothered exploring what sits in the Extensions and Updates gallery for a while, shall I say never? TBH I am both amused and surprised at some of the stuff there.

Amused : Umbraco New Project template (chuckle)

Surprised: F#, MVC 5, Web API 2 templates, Nancy templates, VSClojure... Wow, support for ClojureScript in VS. There is probably one for R coming in the near future, watch out, hahah.

The let down: templates for Exception Class, Struct... Come on, 8 versions into .NET and we still need templates for this?

There are companies using the extensions gallery as a way of luring developers into buying components / products via . Nice marketing. There are plenty, like Aspose and Cocos2D; however, the obsession with EF is never ending and the need for so many templates for different versions is very questionable. MVC probably comes next in the list.

This again does not include the numerous NuGet packages created for various project templates. I think I have got to raise my hat to all those guys who have taken the time and put in the effort to create these templates / NuGet packages. Except the ones for Singleton, Struct and Exception classes :)
Jokes apart, there is quite a bit of useful stuff there which I clearly have missed.

Wednesday 27 February 2013

Building Agile Teams – Not sure?


In the process of adopting agile methodologies, some organisations don't really pay attention to how their teams are formed. Forming a team for a project is different from forming teams in an organisation. The latter needs a lot of thought, compromise and patience, owing to the needs of the business/client. I don't understand why organisations bother adopting agile without this view. It is all done for that client/product/project who is important; yes, that's the primary objective, but it doesn't cut it when you mess with people's minds by constantly moving them around teams. I would go as far as saying at least 30-40% of this effort is wasted: every time a team is put together it goes through the same cycle of forming, storming, norming and performing. I really think this is the most underutilised concept from Tuckman. It is used as a punch line on slides with little attention to the consequences of the process. Teams once formed should not be broken up without valid reasons (sorry, multiple projects and cost are not good enough reasons; let me know if you have other reasons, I know I have a few).

At this point you will be thinking this is just the usual crap people talk about. Here is a hint: if you use the word resource for a team member, you have never paid attention to the do’s and don’ts listed below. The change in perception that's needed is that a skilled technologist who is productive in one context is not necessarily productive or efficient in another. The context here is that of a product, project or service being worked on. This context is never constant in IT and all you can do is keep the team constant (even your machines are changing with updates every day). Can two moving parts in a system bring stability unless the forces negate each other? Are there exceptions? Not sure…

1. Don't build teams of specialists.

When companies organise teams by discipline, such as testers, analysts, designers and developers, all they have done is create silos of specialists who are not effective in how they work together as a software development team. The dynamics between individuals within a team are quite different to those between people who work in teams founded on disciplines. That directly reflects on how efficient and how productive they are. These teams based on disciplines create the biggest hurdle. How will you align the goals of these discipline-based teams with the goals of the project they are working on? It doesn't make sense, because they achieve neither.

2. Don't share people unless they are specialists.

The whole theory that people in a team founded on a discipline are specialists is not true. You become a specialist by virtue of doing something valuable in a project or system. Do not share the basic functional roles of analysts, testers and developers between projects; it is not as cost efficient as you perceive it to be. There used to be a time when I acknowledged the cost benefit of sharing certain roles between projects, but in recent times I have increasingly lost faith in this idea. To be honest it does more harm and costs you more (not accounted for or measured most times) than any real cost benefit. I don't think the human mind is capable of treating two goals with equal priority when a person is shared between teams. If ever you wanted to do this, it may be achieved in a cross-functional team, where a specialist is shared and the team has a bunch of all-rounders.

3. Build Cross Functional teams

Building software has been split into so many functional roles these days that it is beyond me to understand why. Is it difficult for someone to do a bit of analysis and testing alongside developing code? I am not asking them to be a specialist in those disciplines, nor am I asking them to change their careers. I personally feel quite proud of being able to do more than one functional role. Being cross-functional only means you should be able to do the other kinds of work the team requires, so you keep the work flowing on your Kanban board. But I have noticed teams increasingly blocking work, and team members not picking up a card that needs to move further towards done. This may also be compounded by managers feeling that only specialists should do that work. Many a specialist has screwed up, and I am witness to it, so what is stopping teams from encouraging newbies / all-rounders to take a chance? There is always the specialist to come back and review it.

4. Encourage common goals for the teams.

Career progression and teams based on disciplines have murdered the whole system and have pampered individuals into thinking that it is normal to work in one functional area and that it is not their responsibility to do anything else that is required to achieve the common goal. Are they specialists in their own functional area? I doubt this theory, because a majority of them are not really experts of any kind; they do just about enough to keep things going and make just about the same mistakes in judgement as team members in other functional roles. The corporate annual review system has completely trained people to think that they need to meet their personal goals before the team goals (very few places even have these). Since this is tied to some kind of bonus, the motivation is pretty high to get at least your personal goals right. This system in my view is the worst evil in most organisations and is a passive killing machine.

5. Don’t encourage Big Teams

Any team of more than 8 to 10 people is big. Building teams larger than this increases overheads. Team members begin to feel they don't get enough information and time to function properly. This leads to less visibility of the flow of work and pushes some members of the team to delegate work and, in some extreme cases, to micromanage.

6. Don’t use Agile if the team is working as a resource pool

Agile can't work with a model where multiple people work on multiple projects as if they were a single team. So don't force agile on a resource management model like this and then ask why agile doesn't work. Best not to use agile in this model.

Wednesday 6 February 2013

An analogy for CI–Books

Not sure what I wanted to write about, but there is this crazy idea in my head about an analogy that I need to pour out somewhere.

Over the years of working with agile development practices in my not so long career, I have come to find various analogies in the real world for the agile development process. I am not going to say I live and breathe Agile, but I would like to think I am extremely passionate about it. When I coach a team I take a lot of interest in using real life analogies, and as a result I have some for myself when I look at the practices I follow.

An analogy, by definition, is an inference or argument you draw from one context and apply to another; some are used to make an argument while some are used to enhance the context. The one analogy I repeatedly draw on is that of a well authored (or rather, published) book. I may or may not be right to draw this analogy, but in the context of this article on release management and deployment tools, I was hoping it would allow me to emphasize the importance of continuous integration.

The concept of a well authored book falls in line with the various parts of our agile development process.

The first step of an agile development process is the release planning and management process. This is like the preamble of the book, where the author lists all the people involved in writing the book and anyone who has contributed to it in terms of literature. These people can be compared to the stakeholders or the users of the software being developed. The preamble gives an insight into how the book took its current form, and the release management process allows the vision of the users to manifest as working software in the agile development process.

Now continuous integration can be compared to the index of a book. The index gives you a quick overview of the contents of the book; it lists the specifications of the software in an indexed manner, and should a reader want to quickly get some information, it is available via the page numbers in the index.

The chapters of the book are the iterations in the development cycle. Each chapter lists the various scenarios in the software and explains the various aspects of the software being built; these are the detailed specifications of the software. As a user sees the software being built in every iteration, each chapter in the book is completed and finally the book takes its form as a result. The index is meanwhile continuously enriched by the key points of the chapters being written. The index is like a snapshot of the book as it is being built, and so is the continuous integration environment for the development of software. For young agile teams, a CI (continuous integration) environment is merely something that builds software and runs a few tests, but as teams grow and continuously improve they start using CI for various other purposes such as reporting project health, metrics and in some cases also deployment. Then comes the appendix :P just kidding.

Rightly or wrongly I drew this analogy, but I don't really care at this point how meaningful this post is. I just wanted to write about this crazy analogy I had on my way home.

Wednesday 30 January 2013

The Agility Grid in organisations

We all work in an industry called Information Technology, where information is the most under-rated commodity. We have structures in place which were not modelled for this industry. We have been retrofitting our operational and management models based on the principles of other industries. The irony of all this is that not all of us are from the era in which this started, nor are we from other industries. A majority of us started our careers in this industry, and yet we have some of the most rigid organisations, and in the last 5 to 10 years we have been trying too hard to make these structures agile and flexible.

Organisations have structures, some horizontally aligned and some vertically aligned. When organisations embark on the path of using/enabling agile practices (which is the trend at the moment in IT), they tend not to reorganise themselves to enable this. A very typical observation has been that a lot of the change happens between middle management and the operational teams (in IT, the development teams).

I think at this point it is important to reiterate that being agile is not just about following good software development practices and processes, and cannot be achieved purely by fixing one business unit. It is really an outcome of the collaboration of several business units in the simplest possible way, so that when they work collaboratively the outputs of each business unit enable other business units to work efficiently, with agility, in an iterative way. The iterative nature needs to be reflected in how they hire people, how they budget, how they specify products and also how they sell them. Purely developing a product iteratively is only the first step. Reflecting on this and modelling your business around it is the optimum path for a business which is agile.

Here is my theory, which I have been pondering in my head for a while. It is in no way a pro matrix management theory; however, it does seem to hint that matrix management may be the way to go if planned and executed with care.

The Grid

An organisation is effectively a grid, where each line has people and resources which produce forces. These forces operate upwards/downwards or left/right. At the intersections on the grid, the forces align with or oppose each other to enable the flow of information and change. Where forces oppose each other you will find resistance and an inability to produce results; where forces support each other you will find progress and agility.

No, I am not preaching matrix management, but I am trying to apply the idea of forces in physics to analyse the situation and find out which organisations can adapt to agility better. I am not sure everyone will agree, but the majority of organisations have decision makers along the vertical lines and enablers on the horizontal lines. This is purely based on what I have observed and is in some ways my interpretation.

In horizontal organisations, there tend to be more enablers than decision makers. This is fine for making progress and letting internal changes flow smoothly. However, the ability to react to changes, i.e. external forces (could say Porter's five forces), is reduced. This is inevitable due to a reduced or slower decision making process. So are we saying this is not going to work? No, on the contrary, a horizontal organisation seems to be an easier one to model and adapt to agility. A combination of collaboration practices that balance internal forces and how information is relayed and used makes it a more viable change. Collaboration is not just about how people work with each other; it is also about how information is relayed and how it is consumed in the organisation. Horizontally structured organisations with good communication channels and a democratic decision making process could adapt to agility more efficiently.

So where does this leave vertical organisations? Based on the grid theory above, there tend to be more decision makers than enablers. This is fine; however, progress is limited as there are fewer enablers. You may find that organisations which compete on market share generally fit in here. They tend not to be the innovative ones, and even if they are, it might be worth looking at the product life cycle of these organisations to see how short the life span of their products is. P&Ls may tell otherwise, but look into these organisations and you will find that they have a tinge of diversifying into multiple markets or industries. A relatively high ratio of decision makers tends to put these organisations in the mode of reacting to incidents rather than to change of any form. This is because there are fewer enablers, with a reduced capacity to induce a positive change or any form of improvement. This organisation structure is the toughest nut to crack. However, it is the most common structure in the industry.

Management structures reflect this aspect more than any other form. This I believe is a consequence of modelling our management structures along the lines of the manufacturing industries, compounded by factors such as culture, control and power. It shows in governance and operations. I am not going to elaborate any further, but I simply feel the manufacturing line model of management from the 80’s no longer suits the 21st century organisation which is aspiring to be agile.

Moving on: is Management 3.0 the answer? Not entirely. There is more to an organisation than management. We haven't even skimmed other business functions such as innovation, sales, marketing, strategy etc. These aspects need to operate differently, which comes back to validate my original conclusion that it is not enough to just change a few business units to achieve agility. You have more on your hands than you think you do.

Tuesday 16 October 2012

TeamCity – Automate to find Broken Links on your website

If you work on a website, one of the smoke tests for it could be to check that no links on your site are broken. If the CI server can do this on every check-in, there is nothing better, as you can keep an eye on it for every change. Repurposing such a test as a smoke test for every deployment, or even for monitoring purposes, is extremely useful; good luck doing this manually. (When running it against Live, make sure there is a filter in Google Analytics to exclude your test machines.)

Of the many tools available out there, LinkChecker seems to fit the bill for CI. It supports several modes; see its docs for more info.

Download it and install it on your development machine

To run the test locally (linkchecker-8.1 should be installed on the machine), you could run a command such as: "C:\Program Files (x86)\LinkChecker\linkchecker.exe" --file-output=html/report.html --recursion-level=-1

LinkChecker has to be installed on all TeamCity agents. To hook up a build, add a build step as below, which is equivalent to the command above.


This will allow you to hook a build on TeamCity to run a test which spiders through your site checking for broken links.

The command above produces a report with the name linkchecker-report.html in the agent's working folder. If you publish this HTML file as an artifact and add a report tab on the TeamCity server called “Link Checker Report” for the linkchecker-report.html artifact, you should be able to see the report after every build is run.

For broken links the build will fail, as linkchecker returns exit code 1 on the command line; for successful runs it returns 0.

See below for an example report which uses report tabs on TeamCity to show the report.


Tuesday 20 September 2011

Effect Mapping to manage products

Just been to this talk by Gojko on Product Management using Effect Mapping. It is a technique which is useful for high level project visualisation. It is very similar to the mind mapping technique, where stakeholders, users and teams collaborate on project scope.

It helps reduce the scope of wish lists and helps teams focus on business goals by asking the questions Why?, Who?, How? and What? in that order.

Why? Allows you to narrow down to the business goal. This is the centre piece of your effect map, from which all other discussions should start and reason.

Who? It is not the user, but whoever can cause the desired effect to achieve the business goal. In most cases these are project / product stakeholders.

How? For each stakeholder, identify how the target group can achieve or obstruct the desired effect in real life and not in terms of software. These should effectively be stakeholder needs.

What? For each stakeholder, identify what business activities or software capabilities would support the needs of the stakeholder. These become your epics in the product backlog.

At the end of the effect map both the stakeholders and the team should be able to see the synergy between the business goals and what needs to be achieved.

For more, see Gojko’s white paper on this.

Some advice from people who have used this:

  • Getting the right number of people can be a challenge
  • Staying focussed and at the right level of detail is important
  • Ensuring enough focus on the how is important
  • Keeping everyone away from solutionising is a real big challenge when technical people are involved
  • Ideal group size of 5-8 when working for a time box of 2-3 hours
  • Will be of immense value to the business

This technique is not necessarily just for agile projects; you could use it even for waterfall projects.

A useful tool I have found which you can use for this is at Check it out, it's pretty handy. Even if you are a developer striving to do something on your own, putting your ideas on a mind map will help you visualize the idea. :)