I am now participating in an amazing project – a full-time, one-month coding school in the C language, offered by a new educational institution called 42Prague. I’m having such a great time that I actually took unpaid leave just to stay. I will write more about them and that experience later.
Given that I’ve already programmed in C#.NET and that I’ve been in IT for over 15 years, I should progress rocket fast. But the opposite is true, because I am also a Senior QA / Software Safety & Reliability / Safety Engineer by trade. And that comes with strings attached.
I tend to write code that is somewhat overkill for the task, with input sanitization, watchdogs, and size checks. I write automated tests. And I dig deep, maybe too deep. Which is how I lost several hours of dev time developing automated tests and then investigating…
Why the heck does the C standard function strcpy() from string.h return a different result than my implementation of the same thing in a fringe scenario?
The task at hand in the 42Prague exercise was easy enough – or so it seemed:

What does the man page for strcpy() say – this time, in the role of a Binding Customer Requirement?

Resp.

So dest, the pointer to the destination char array, serves a dual role:
- it is where the input gets copied directly through pointers, i.e. already inside the strcpy() function or its alternative implementations
- it is also handed back as the return value once the function is done copying to it
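The dual role can be sketched in a few lines. This is a minimal illustration under assumptions, not the exercise solution itself: my_strcpy and demo_dual_role are hypothetical names, and the index-based loop is just one way to copy without ever changing dest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical re-implementation for illustration only.
 * Copies src, including the terminating '\0', into dest and returns dest.
 * The caller must make sure dest is large enough. */
char *my_strcpy(char *dest, const char *src)
{
    size_t i = 0;

    /* Assign first, then test: the loop stops right after '\0' is copied. */
    while ((dest[i] = src[i]) != '\0')
        i++;
    return dest; /* dest was never modified, so it still points at the start */
}

/* Checks both roles of dest on one small input. */
void demo_dual_role(void)
{
    char buf[16];
    char *ret = my_strcpy(buf, "hello");

    assert(strcmp(buf, "hello") == 0); /* role 1: the data landed in dest */
    assert(ret == buf);                /* role 2: dest came back as the result */
}
```

Because the loop indexes into dest instead of advancing it, there is no need for a saved copy of the starting address.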
That is exactly what the Doc says about the return value:

Resp.

Hence, this is also how I wrote my feeble re-implementation. Naturally, I then proceeded to diff-test the string.h strcpy() reference implementation against my exercise. Yet to my surprise, in some of the edge-case scenarios, the results differed significantly.
Note Tests 5 through 7:
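A diff test of this kind can be sketched as follows. Everything here is an assumption for illustration: my_strcpy stands in for the re-implementation, diff_case and run_diff_tests are hypothetical helpers, and the inputs are not the ones from my actual test run. The harness compares two things: the bytes left in the buffers, and whether each function returned the buffer it was given.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical candidate under test (a stand-in, not the exercise code). */
static char *my_strcpy(char *dest, const char *src)
{
    size_t i = 0;

    while ((dest[i] = src[i]) != '\0')
        i++;
    return dest;
}

/* Runs strcpy() and the candidate on the same input.
 * Returns 1 if the buffers are byte-identical and both functions agree on
 * whether they returned their own destination buffer, 0 otherwise. */
int diff_case(const char *src)
{
    char ref_buf[64];
    char my_buf[64];
    char *ref_ret;
    char *my_ret;

    assert(strlen(src) < sizeof ref_buf); /* input must fit, '\0' included */

    /* Pre-fill both buffers so the untouched bytes are comparable too. */
    memset(ref_buf, 'X', sizeof ref_buf);
    memset(my_buf, 'X', sizeof my_buf);

    ref_ret = strcpy(ref_buf, src);
    my_ret = my_strcpy(my_buf, src);

    return memcmp(ref_buf, my_buf, sizeof ref_buf) == 0
        && (ref_ret == ref_buf) == (my_ret == my_buf);
}

/* Runs a few illustrative inputs and returns the number of mismatches. */
int run_diff_tests(void)
{
    const char *cases[] = { "", "a", "hello", "two words", "tab\there" };
    size_t count = sizeof cases / sizeof cases[0];
    size_t i;
    int failures = 0;

    for (i = 0; i < count; i++) {
        int ok = diff_case(cases[i]);

        printf("%s: \"%s\"\n", ok ? "match" : "MISMATCH", cases[i]);
        if (!ok)
            failures++;
    }
    return failures;
}
```

Comparing the returned pointer against the buffer that was passed in, rather than printing whatever it points to, keeps the check tied to the documented contract and away from how the result happens to be displayed.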

GNU C was consistently returning the first character of the source, when it shouldn’t have returned anything if it adhered to the specs!
I tried everything I could think of, but one fact was clear: regardless of what I did, there was no way I could replicate the observed behavior of the string.h strcpy() if I were to return the pointer to the dest char array. No way.
My confusion and suspicion grew. Finally, I looked up the source code of the original <string.h> library and the strcpy() function within. And what a surprise it was, once more!

The original does not adhere to the documentation.
It does not return the pointer to dest.
Instead, it returns a helper variable tmp – the same which I initially had and then removed from my code to be faithful to the requirements/doc!
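A minimal sketch of what such a helper variable typically does, assuming the library source resembles the pointer-walking loop common in C libraries (the exact source in question is not reproduced here). The names copy_with_tmp, copy_advanced and demo_tmp are hypothetical. In a loop of this shape, tmp is assigned from dest before dest is advanced, so it holds the address of the start of the destination buffer, while dest itself ends up just past the copied terminator.

```c
#include <assert.h>
#include <string.h>

/* Pointer-walking copy: dest itself is advanced during the loop,
 * so its starting value is saved in tmp first. */
char *copy_with_tmp(char *dest, const char *src)
{
    char *tmp = dest; /* remember where the buffer starts */

    while ((*dest++ = *src++) != '\0')
        ; /* empty body: the copy happens in the loop condition */
    return tmp; /* same address the caller passed in as dest */
}

/* Same loop without the saved copy: returns the advanced pointer,
 * which points one past the copied '\0'. */
char *copy_advanced(char *dest, const char *src)
{
    while ((*dest++ = *src++) != '\0')
        ;
    return dest;
}

/* Shows where each returned pointer ends up for the input "hi". */
void demo_tmp(void)
{
    char a[16];
    char b[16];

    assert(copy_with_tmp(a, "hi") == a);     /* start of the buffer */
    assert(copy_advanced(b, "hi") == b + 3); /* past 'h', 'i' and '\0' */
    assert(strcmp(a, b) == 0);               /* contents are identical */
}
```

Both variants leave the same bytes in the buffer; they differ only in which address is handed back, which is exactly the kind of difference that shows up when the return value is printed or dereferenced.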
I wasted an absurd amount of time on this, but in the end, I stand vindicated. And as always, I learned a lot in the process – and that’s what counts:
- no program in the world adheres to the requirements or specs or doc 100% faithfully – so I shouldn’t expect such magic from FOSS either.
- when my code looks like a correct implementation of some requirements, I need to judge it by those requirements alone – not by another reference implementation
- if there’s a difference between the output of my code and the reference implementation, and if the reference source code is available, I shouldn’t waste time; I should investigate the reference source code immediately
- I need to timebox even diffs which may look important, if they come from fringe/edge scenarios that the customer (in this case, 42School’s evaluator program called Moulinette) will not care about.
Will I submit this as a bug to GNU? Probably not. They cannot change the implementation because of backwards compatibility, which is the next thing to sacred in established codebases. The most they could do is amend the man pages with a note that the native implementation doesn’t strictly adhere to the documented return value. And such a doc fix offers little value for the effort.
So in the end, this is just feedback for my own firmware, offering more lessons learned to build on in the future.