Anecdotes about test coverage

“You don’t want 100% coverage, because it is really very expensive to achieve. Instead, you should focus on ensuring coverage in the places where you most frequently have bugs.”

This is an excerpt from a conversation I had with Stef while we were driving along the road from Lens to Lille, in October 2010. Although I put the text in quotation marks, the quote is not verbatim, since Stef was speaking in English and this is my own rendering; the central idea, though, is the one reflected here.

To close this post, let me share a situation I experienced some time ago. One of the people I was training gave a demo of the application he had developed as part of the training. The demo was going well, and at one point we asked him to show a particular feature that had not been shown yet. When he went to show it, the application crashed! Ha! So I asked him: “and the tests?” Guess what... there were no tests for that feature. The coverage of the application did not exceed 40%. What better way to illustrate the importance of these matters! I hope this colleague learned the lesson.

Finally, I want to thank David Frassoni, who gave me the idea of writing this post last week.

Some ideas about test coverage

Yesterday I received a question about this topic and I was convinced I had already written something about it. I went looking and found this post, which I had written more than a year ago but never published; I don’t remember why.

Continuing with this post I wrote some time ago, today I want to share some thoughts. Personally I think it is important to have a high degree of coverage, but we should not lose sight of the fact that coverage only indicates that the code has been exercised; it says nothing about how well (or how badly) it has been exercised. Just a few days ago Carlos shared this article with me, which places special emphasis on this point.

As indicated in this article I shared previously, there are different types of coverage. Specifically, the form of coverage I described falls under what is called statement coverage, which is what the vast majority of tools measure.

But this alone is an insufficient tool, and if you do not analyze the results with judgment, you could easily fool yourself.
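To make the point concrete, here is a minimal sketch in Python. The function and test names are invented for illustration and do not come from any real project. A single test executes every statement of average, so a statement-coverage tool would report 100%, yet the empty-list case is never checked and still fails.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0
    for value in values:
        total += value
    # Bug: an empty list makes len(values) zero, so this line raises
    # ZeroDivisionError. Statement coverage alone cannot reveal that.
    return total / len(values)


def test_average_of_two_numbers():
    # Running only this test already executes every statement in average().
    assert average([2, 4]) == 3.0


def empty_list_fails():
    """Return True if the untested empty-list case raises an error."""
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False
```

The coverage number is the same before and after someone adds a test for the empty list, which is exactly why the number needs to be read with judgment.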

It is true that going from having no tests to having tests that exercise 80% of the code is a very important improvement, but from there to “trusting” the quality of our code there is still an interesting stretch to cover, and possibly the most complex one. Moreover, given the effort that the remaining 20% may involve, it may be more convenient to simply concentrate on the parts of the code that tend to carry the most defects.

To be continued…

User Stories vs. Use Cases

At first glance, user stories tend to be seen as analogous to the use cases of the Unified Process, in the sense that both artifacts describe, in a way, a piece of system functionality. Personally I think this analogy is not appropriate: while a use case is effectively a specification of a requirement, a user story could at most be the title of that requirement.

Moreover, to a certain extent we could say that user stories are, in spirit, the opposite of use cases. Use cases try to cover the details of the requirement so that the programmer can produce a complete and correct implementation of it, whereas user stories are intentionally vague, because what they seek is to promote dialogue between whoever must implement the functionality and whoever requested it.

Continuous Delivery, a high-level view

I do not want to go into detail about definitions, benefits and impediments related to this practice. I think there are enough sources of information on the web (just googling the term continuous delivery returns around 18,000,000 results).

Assuming the reader is already familiar with the basic definitions, I want to share my view of this practice.

A central concept in the practice of continuous delivery is the so-called Deployment Pipeline. It models the organization’s process for materializing the implementation of a business idea/need. Two points to highlight:

  • We talk about the organization and not the team, because this process involves several areas of the organization beyond the development team.
  • We talk about materializing a business idea/need. It is not enough for the software to be developed; the need is not solved with code sitting in a repository or in a .zip package, but with software running in an environment where the customer can use it.

The first part of this Deployment Pipeline must be provided by the development team by implementing the practice of continuous integration, which can be an interesting challenge if the team works on legacy code (read: legacy = without tests). Here it is important to stress that adopting continuous integration goes far beyond setting up a Build Server to monitor the repository, compile the code and run the tests.

The second part of the Deployment Pipeline is the one that involves other areas of the organization, since it is the part that travels through the different environments until reaching production.
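One way to picture the two parts is as an ordered list of stages that a change must pass through. The following Python sketch is only an illustration under my own assumptions: the stage names, the change dictionary and both functions are hypothetical and do not correspond to any particular build server or tool.

```python
def run_pipeline(stages, change):
    """Run the stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage(change):
            break
        completed.append(name)
    return completed


# Hypothetical stages. The first two belong to continuous integration;
# the last two stand in for the environments owned by other areas.
stages = [
    ("compile", lambda change: change["compiles"]),
    ("automated tests", lambda change: change["tests_pass"]),
    ("deploy to test environment", lambda change: True),
    ("deploy to production", lambda change: True),
]


def is_delivered(stages, change):
    """A change counts as delivered only if it reached the last stage."""
    return run_pipeline(stages, change) == [name for name, _ in stages]
```

The design choice worth noticing is the definition of is_delivered: a change that compiles and passes its tests but never reaches production has not resolved the business need.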

Backlog management Tips

In this post I want to share some tips about this topic.

Backlog basics (or preliminary definitions)

Some people tend to think that a backlog is just a simple list of items. I don’t think that is true; I try to use the word backlog to refer to a list of items that are prioritized and estimated. Of course, the prioritization is done by the customer while the estimation is done by the technical team.

What are the items in the backlog? Depending on the context, the items can be user stories, use cases or simply tasks. When working on a development project I try to avoid having tasks in my backlog because in many cases tasks do not add value to the customer, who prefers user stories. In most cases, what really adds value to the customer is working software, and that is represented by a group of user stories.

The backlog is commonly used to organize work, but it is also used to provide visibility of the project’s status.

Backlog recommendations

Make the status EXPLICIT

A very common strategy to show the status in a consumable way is to use a traffic-light pattern (green-yellow-red) on top of statuses such as Done, In progress, Blocked and Pending. With this convention it is very easy to detect smells. For example, if there are many yellow stories it could mean that there is too much work in progress and that you are not focused on completing specific stories.

Note: if you are using a dashboard, then the status is given by the position of the item in the dashboard.
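As a rough sketch of how this convention can be used to spot smells, the Python snippet below counts items per color. The mapping of Done to green, In progress to yellow and Blocked to red is my reading of the pattern, leaving Pending uncolored is an assumption, and the backlog items are made up.

```python
from collections import Counter

# Assumed mapping from status to traffic-light color.
STATUS_COLOR = {
    "Done": "green",
    "In progress": "yellow",
    "Blocked": "red",
    "Pending": "uncolored",
}


def color_summary(backlog):
    """Count backlog items per traffic-light color."""
    return Counter(STATUS_COLOR[status] for _title, status in backlog)


# Hypothetical backlog as (title, status) pairs.
backlog = [
    ("Login", "Done"),
    ("Search", "In progress"),
    ("Checkout", "In progress"),
    ("Reports", "In progress"),
    ("Export", "Blocked"),
    ("Profile", "Pending"),
]

summary = color_summary(backlog)
```

Here summary["yellow"] is three times summary["green"], which is the kind of picture that should prompt a conversation about focus.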


Limit the work in progress

Completed items are the only real measure of progress, so you should focus on completing items instead of accumulating stories in progress. This leads us to the following question: how many stories can be in progress at the same time? There is no single answer; it depends on the length of time, but each person should work on ONE item at a time. In a particular situation it could be that two items are closely related, so you could decide to work on both at the same time. But if you have 3 team members and 10 items in progress, you should take a closer look at the situation.
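The rule of thumb above can be written down as a tiny check. This is only a sketch in Python: the function names and the limit_per_person parameter are my own, and the default of one item per person is the guideline from this post, not a universal rule.

```python
def items_in_progress(statuses):
    """Count how many items are currently marked as in progress."""
    return sum(1 for status in statuses if status == "In progress")


def exceeds_wip_limit(statuses, team_size, limit_per_person=1):
    """Return True when more items are in progress than the team can focus on."""
    return items_in_progress(statuses) > team_size * limit_per_person
```

With 3 team members and 10 items in progress the check returns True; raising limit_per_person to 2 models the case where closely related items are worked on together.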

INVEST in SMART

When working with user stories, it is good practice to keep the INVEST acronym in mind:

  • Independent
  • Negotiable
  • Valuable
  • Estimable
  • Small
  • Testable

Beyond user stories, I think these considerations are very useful when creating a backlog, no matter what your items are.

Another famous acronym related to planning, widely used when defining goals, is SMART:

  • Specific
  • Measurable
  • Achievable
  • Relevant
  • Time-boxed

More information about these acronyms can be found here.

My backlog

The following picture is a screenshot of the backlog of my current project. I have highlighted some important properties:

Game development, episode #2: some more concepts

Before drilling down into technological stuff I want to share some basic concepts.

Collision detection is a very common concern in games. It consists of determining whether two objects have come into contact with one another. This is necessary in order to make decisions; for example, in games like Mario Bros. it is important to know whether Mario has collided with a tortoise or a coin. In order to detect collisions, each object has a bounding box, which is an approximation of the object’s surface.
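A minimal sketch of the idea in Python, assuming rectangular bounding boxes aligned with the screen axes (the most common simplification). The class, field and function names are my own choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def collides(a, b):
    """Return True when two axis-aligned bounding boxes overlap.

    Boxes that merely touch along an edge are not counted as colliding.
    """
    return (
        a.x < b.x + b.width
        and b.x < a.x + a.width
        and a.y < b.y + b.height
        and b.y < a.y + a.height
    )
```

For example, a 16x16 box at (0, 0) collides with an 8x8 box at (10, 10), but not with a 16x16 box at (40, 0). Because the box is only an approximation of the sprite, this test can report a collision slightly before the drawn shapes actually touch.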

A sprite is a small bitmap image (or set of images) used to visually represent an object in the game. Another kind of bitmap is a tile, which is usually used for background maps.

During the drawing process, all the sprites and tiles are drawn incrementally into a buffer, and then this buffer is dumped at once into the low-level buffer that is shown on the screen. This technique is called double buffering.
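The following Python sketch imitates the technique with a grid of characters standing in for pixels. Everything here (the grid size, the function names, the use of letters as sprites) is a simplification of my own; a real game would rely on the buffers provided by its graphics library.

```python
WIDTH, HEIGHT = 4, 2


def blank_buffer():
    """Create an empty buffer where '.' means nothing is drawn."""
    return [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]


def render_frame(screen, drawables):
    """Draw every item into a back buffer, then copy it to the screen at once."""
    back_buffer = blank_buffer()
    for x, y, symbol in drawables:
        # Incremental drawing happens off-screen.
        back_buffer[y][x] = symbol
    # The finished frame replaces the screen contents in a single step,
    # so a half-drawn frame is never visible.
    screen[:] = [row[:] for row in back_buffer]
```

Calling render_frame once per frame with the current sprites and tiles keeps the intermediate drawing steps hidden from the player.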

To be continued…

Game development, episode #1: the gameloop

I suspect that most readers of this blog are programmers, so I must start by saying that game development is a broad activity that spans several areas, not only programming. Of course, I am a programmer, so the focus of this series will be programming, but I will also try to provide some resources related to the other activities. After this brief comment, it’s time to switch to the main topic of this post: the game loop.

From the programming point of view, the main component of any game is the game loop. This loop performs several tasks that can vary depending on the platform your game runs on, but conceptually they can be summarized in two: executing game logic and drawing.

Of course we can drill down into these tasks and get a more detailed game loop like the following:

while ( not gameover and user not exits)
{
  process user input
  execute AI
  resolve collisions
  draw graphics
  play sounds
}

The game loop is the main component of any game but not the only one: there are other components that are not part of the loop, such as the game menu and the settings screen.
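To see the loop actually run, here is a small Python sketch that follows the same steps as the pseudocode above. It is deliberately artificial: the user input is a scripted list, the "game" is a player walking towards a goal, and drawing and sound are replaced by log entries. All names are hypothetical.

```python
def run_game(scripted_inputs, goal_x=3):
    """Run a tiny game loop and return a log of what happened."""
    log = []
    player_x = 0
    game_over = False
    user_exits = False
    inputs = iter(scripted_inputs)
    frame = 0

    while not game_over and not user_exits:
        # Process user input (running out of scripted input means "quit").
        command = next(inputs, "quit")
        if command == "quit":
            user_exits = True
            continue
        if command == "right":
            player_x += 1

        # Execute AI: this toy game has no enemies, so nothing to do.

        # Resolve collisions: reaching the goal ends the game.
        if player_x == goal_x:
            game_over = True

        # Draw graphics (a log entry stands in for real drawing).
        log.append(f"frame {frame}: player at x={player_x}")

        # Play sounds (again, only logged).
        if game_over:
            log.append("sound: goal reached")

        frame += 1

    return log
```

Both exit conditions of the pseudocode are present: the loop ends either because the game is over or because the user leaves.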

Object creation review

When working with OO languages we use constructors to instantiate our objects. Constructors should always return a valid instance that is ready to use, so you don’t need to perform any additional task before using the instance. On some occasions an object needs some collaborators to perform its work, and in those cases you will need to provide a constructor that takes these collaborators as parameters.

The syntax of constructors varies depending on the language. In C++-like languages (Java, C# and others) constructors take the name of the class. In Object Pascal (the language behind Delphi) you use the keyword constructor to declare them. And finally we have Smalltalk-like cases, where constructors are simply class methods: no special name, no keyword, just call them as you like.

One practice I like when defining constructors is to give them some semantics. This is very common among those who work with Smalltalk, but not so common among those working with C++-like languages. Many times when I work with C# or Java, I find classes that provide several constructors with different parameters, but it is not so clear how these constructors work. Of course all of them return a valid instance, but do they perform any additional task? Is there any particular property in the created instance? Let’s see an example to illustrate it. Let’s define a person class with a first name (mandatory) and a last name (optional).

In C# we could have something like this:

public class Person
{
    public Person(string firstName) {…}
    public Person(string firstName, string lastName) {…}
}

And to use it:

Person aPerson = new Person("John");
Person anotherPerson = new Person("John", "Foo");

In Smalltalk this could be:

Object subclass Person
Person class>>withFirstName: aFirstName.
Person class>>withFirstName: aFirstName andLastName: aLastName.

And we use it this way:
aPerson := Person withFirstName: 'John'.
anotherPerson := Person withFirstName: 'John' andLastName: 'Foo'.

As I said, this practice is very common in Smalltalk, but it can also be used with C++-like languages. In fact, it is recommended by Kent Beck in his book Implementation Patterns, which is focused on Java. Let’s refactor the C# person class to use this pattern.

public class Person
{
    public static Person WithFirstName(string firstName) {...}
    public static Person WithFirstNameAndLastName(string firstName, string lastName) {...}
}

(I know, Smalltalk’s ability to merge method name and parameters provides a great user experience that is impossible to emulate with C++-like languages.)
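For comparison, here is how the same practice might look in Python, where class methods play a role close to Smalltalk’s class-side methods. This is only a sketch: the method names simply mirror the C# ones, and Python is not one of the languages discussed in Kent Beck’s book.

```python
class Person:
    def __init__(self, first_name, last_name=None):
        # The plain constructor still returns a valid, ready-to-use instance.
        self.first_name = first_name
        self.last_name = last_name

    @classmethod
    def with_first_name(cls, first_name):
        """Create a person that has only a first name."""
        return cls(first_name)

    @classmethod
    def with_first_name_and_last_name(cls, first_name, last_name):
        """Create a person with both first and last name."""
        return cls(first_name, last_name)


a_person = Person.with_first_name("John")
another_person = Person.with_first_name_and_last_name("John", "Foo")
```

The call site now says what kind of instance is being created, which is the whole point of giving constructors some semantics.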

Claim-based identity (part#2)

Before diving into the WS-* protocols I mentioned in part#1, it is important to review two important concerns about exchanging messages over the net:

  1. How can I be sure that the message I get has not been read or modified along the way?
  2. How can I be sure about who sent a message?

Signing the message takes care of the “modified” part of point (1) and of point (2), while encrypting it takes care of the “read” part of point (1). In the following paragraphs I will explain these topics in a simplified way.

Note: the following concepts can be applied in several different ways to resolve the mentioned situations.

Message Signing

Let’s suppose that Endpoint A needs to send a message to Endpoint B.

Endpoint A starts by applying a hash function to the message (typically MD5 in older systems; a SHA-2 function such as SHA-256 is the usual choice today, since MD5 is no longer considered collision-resistant) (step 1). The result is encrypted using A’s private key (step 2), obtaining a signature for the message. After that, the message is ready to be sent along with its signature (step 3). When endpoint B gets the complete message, it starts by separating the message itself from the signature and applies the same hash function to the message (step 4), obtaining a hashed message. At the same time, B decrypts the signature using A’s public key (step 5); the result of the decryption should be that same hashed message. If the two values are not the same, there are two possibilities: the message has been modified, or it was sent by someone other than A.

This way, endpoint B can be sure that the message was sent by A and that the message has not been altered.
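The five steps can be tried out with a toy example in Python. Please read it as a sketch only: it uses textbook RSA with a small key built from two known Mersenne primes, no padding, and SHA-256 as the hash function, all of which are my own choices for illustration and completely unsuitable for real use. It needs Python 3.8 or later because of the modular inverse computed with pow.

```python
import hashlib

# Toy RSA key pair for endpoint A (illustration only, NOT secure).
A_P = 2**89 - 1
A_Q = 2**107 - 1
A_MODULUS = A_P * A_Q
A_PUBLIC_EXPONENT = 65537
A_PRIVATE_EXPONENT = pow(A_PUBLIC_EXPONENT, -1, (A_P - 1) * (A_Q - 1))


def hash_message(message):
    """Steps 1 and 4: hash the message and map it into the key's range."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % A_MODULUS


def sign(message):
    """Step 2: 'encrypt' the hash with A's private key."""
    return pow(hash_message(message), A_PRIVATE_EXPONENT, A_MODULUS)


def verify(message, signature):
    """Step 5: 'decrypt' the signature with A's public key and compare."""
    return pow(signature, A_PUBLIC_EXPONENT, A_MODULUS) == hash_message(message)


# Step 3: A sends the message together with its signature.
message = b"transfer 10 to B"
signature = sign(message)
```

B accepts the pair only when verify(message, signature) is true; changing a single byte of the message makes the check fail, because the hash no longer matches the decrypted signature.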

Message encryption

Now suppose that A needs to send a confidential message to B.

To ensure the message can be read only by B, A encrypts the message using B’s public key (step 1) and then puts it on the wire (step 2). When B gets the message, it can decrypt it using its own private key (step 3).
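Again as a toy sketch in Python, under the same caveats as any textbook RSA example (tiny key built from known Mersenne primes, no padding, nothing here is secure): the key values, function names and message are all assumptions of mine. It needs Python 3.8 or later.

```python
# Toy RSA key pair for endpoint B (illustration only, NOT secure).
B_P = 2**89 - 1
B_Q = 2**107 - 1
B_MODULUS = B_P * B_Q
B_PUBLIC_EXPONENT = 65537
B_PRIVATE_EXPONENT = pow(B_PUBLIC_EXPONENT, -1, (B_P - 1) * (B_Q - 1))


def encrypt_for_b(message):
    """Step 1: A encrypts the message using B's public key."""
    number = int.from_bytes(message, "big")
    if number >= B_MODULUS:
        raise ValueError("message too long for this toy key")
    return pow(number, B_PUBLIC_EXPONENT, B_MODULUS)


def decrypt_as_b(ciphertext):
    """Step 3: B decrypts the message using its own private key."""
    number = pow(ciphertext, B_PRIVATE_EXPONENT, B_MODULUS)
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big")


# Step 2: only the ciphertext travels over the wire.
ciphertext = encrypt_for_b(b"meet at noon")
```

Anyone can call encrypt_for_b because it only needs public values, but recovering the text requires B_PRIVATE_EXPONENT, which never leaves B.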

By combining these two techniques we can ensure the integrity and confidentiality of the message. In other words: only the two endpoints know the content of the message (because it is encrypted), and the receiver can be sure about the sender and the content of the message (because it is signed).

To be continued…

DDD: implementing object’s identity

One important thing to consider when implementing a domain-driven design is object identity.

In most languages, the identity of an object is determined by its memory address, but when doing DDD we should redefine object identity in terms of our domain. For example, suppose we are working in a banking domain and we have an Account class; then we should define the identity of Account objects based on the account number.

When working with C#, there are four methods related to the identity of objects, all of them defined in the root class Object:

public virtual bool Equals(object obj);

public static bool Equals(object objA, object objB);

public virtual int GetHashCode();

public static bool ReferenceEquals(object objA, object objB);

Let’s analyze them one by one.

bool Equals(object obj)

By default this method compares the objects’ memory addresses (that is, their references), but that is not appropriate when doing DDD. As mentioned before, the identity of our domain classes should be defined in terms of domain concepts, so we should override this method. Continuing with the Account class example, this method should compare account numbers: same account number, same object.

public override bool Equals(object obj)
{
    Account otherAccount = obj as Account;
    if (otherAccount == null)
        return false;
    return this.Number.Equals(otherAccount.Number);
}

static bool Equals(object objA, object objB)

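This method is static. It returns true when both arguments are the same reference (or both are null), returns false when only one of them is null, and otherwise invokes the Equals method of objA, passing objB as the parameter.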

int GetHashCode()

This method is not directly about an object’s identity. It is used when a hash code is needed to represent the object, for example if we want to store the object in a hash-based structure like a hashtable. From MSDN: “The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.”

When doing DDD we should override this method in our domain classes so that objects that are equal according to Equals return the same hash code; ideally, objects that are not equal return different values. In the Account class example we could use the account number as the hash code.

public override int GetHashCode()
{
    // Assumes Number is an int.
    return this.Number;
}

static bool ReferenceEquals(object objA, object objB)

This method simply compares references, which is the same behaviour as the default implementation of Equals for reference types.
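The same three ideas (domain equality, a matching hash code, and reference comparison) exist in other languages too. As a rough analogy only, here is a Python sketch in which __eq__ plays the role of Equals, __hash__ the role of GetHashCode, and the is operator the role of ReferenceEquals. The owner attribute is hypothetical and is there only to show that it does not take part in the identity.

```python
class Account:
    def __init__(self, number, owner):
        self.number = number
        self.owner = owner  # hypothetical attribute, not part of the identity

    def __eq__(self, other):
        # Domain identity: same account number, same account.
        if not isinstance(other, Account):
            return NotImplemented
        return self.number == other.number

    def __hash__(self):
        # Equal accounts must produce the same hash code.
        return hash(self.number)


first = Account(1001, "Ann")
second = Account(1001, "Ann B.")
```

Here first == second is true while first is second is false: they are two instances in memory that represent the same account in the domain, and a set containing both keeps only one element.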

Well, this is it. I hope this post is clear enough.

If you want to see some tests running to validate this and make your own test you can download my code from here.

In future posts I will write about how this relates to the equality operator (==) and about the importance of the Equals method when working with persistence frameworks like NHibernate.