Wednesday, May 18, 2016

Log literacy -- a sadly neglected skill

People who come to this blog for the technical posts may have noticed that those rather dried up over the last couple of years. There was a highly boring reason for this -- having built a little guerilla build server with one of my colleagues, it suddenly took over the world, despite our best intentions. This meant an unexpected amount of DevOps work keeping it running, and continuing to develop what had been first intended to be just a system for our team into one that covered multiple sites in different geos.

Fortunately, success was eventually rewarded by another team writing a competing system, to which my response was "Thank you, very much!" and I have at last been able to shed that particular monkey from my back. In time, that period will become a source of many war stories, but there is one particular thing that I did notice.

The key "new thing" in this system was to chain together the individual component builds that were the elements provided to the formal build process, so we could get to system-testable outputs more quickly, in a manner that could work equally on the build server and the developer's local machine. This, of course, meant that developers would at times see their builds fail in components which they were not familiar with, and then come to the two of us juggling the server (and who might be equally unfamiliar with the failing component) for a resolution.

Some of the time it turned out to be some corner case we hadn't considered in our orchestration process, but often enough it was something environmental that precipitated an otherwise normal failure in the MSBuild based process. It's just that a normal failure ends up with yards of error messages, redoubled by being screaming red text when it's in a console window on your desktop. Something like:

\Library4.csproj(##, #): error XXX####: Something went wrong
Done Building Project "D:\Source\Feature\path\to\component\Library4
\Library4.csproj" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Library3
\Library3.csproj" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Library2a
\Library2a.csproj" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Library2
\Library2.csproj" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Library1
\Library1.csproj.metaproj" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Component
.sln" (default targets) -- FAILED.
Done Building Project "D:\Source\Feature\path\to\component\Build\Buil
dAll.proj" (FullBuild target(s)) -- FAILED.


[105 lines of other errors omitted]

"D:\Source\Feature\path\to\component\Build\BuildAll.proj" (FullB
uild target) (1) ->
" (default target) (2) ->
csproj.metaproj" (default target) (15) ->
ibrary2\Library2.csproj" (default target) (16) ->
ibrary.2a\Library2a.csproj" (default target) (18) ->
\Library3.csproj" (default target) (20) ->
\Library4" (default target) (19:2) ->
\Library4.csproj(##, #): error XXX####: Something went wrong

    0 Warning(s)
    7 Error(s)

This, it turns out, is enough to make even many quite senior developers throw their hands up in horror, when it happens in an unfamiliar piece of code -- even though, a few hundred lines earlier, there will usually be an obvious root cause, maybe something quite blatant, like:

  PowerShell.exe -File "D:\Source\Feature\Path\to
\component\Scripts\Do-Something.ps1" [arguments]
  The argument 'D:\Source\Feature\Path\to\component\Scripts\Do-
Something.ps1' to the -File parameter does not exist. Provide the path to an existing '.ps1' file as an argument
to the -File parameter.

which simply wasn't emitted as an error or warning in the MSBuild meaning of the terms, and so wasn't highlighted, or re-iterated, but which would be the start of what would become a cascade of FAILED and Error red text.

This phenomenon of a failure log ending with things going horribly wrong, but with a seemingly innocuous line much earlier that indicates when things started to go bad is not just restricted to MSBuild logs, or even build logs in general.

And that is the nice case. There can be worse. Sometimes the error messages at the end may be entirely unrelated to what actually failed or at best be only tangentially related (e.g. tidying operations failing when what they are supposed to tidy didn't get created) -- so trying to reason from them can be a hiding to nothing.

So when confronted by a failure log that seems to conclude with notification that the sky is falling for no good reason, take a deep breath, refuse to panic, and just start at the top. You might not always be able to resolve the issue personally, but you'll most likely be able to provide a better problem report to the person who can.

Post a Comment