scanf() cookbook

I used to find scanf() awkward because I tried to understand it as the opposite of printf(). That was wrong in many ways.

  • In scanf() format, a space is not necessary between two “%<conversion>”s. For example, format “%d%d” matches input “100 200”.
  • In scanf() format, a white space character (space, tab or newline) matches any amount of white space in the input, including none. For example, format “%d %d” matches input “100\n200”, and format “%d\n%d” matches input “100 200”.

Once I learned that, scanf() started working as I expected. Here are some recipes.

How to read data over newline:

 input:
 100
 200
 program:
 scanf("%d%d", &i1, &i2);
 printf("%d %d\n", i1, i2);
 output:
 100 200

scanf() has no notion of lines. “\n” is just another white space character.

How to read data with leading white space:

 input:
  100
   200
 program:
 scanf("%d%d", &i1, &i2);
 printf("%d %d\n", i1, i2);
 output:
 100 200

Most conversions skip leading white space. The exceptions are “%c”, “%[” and “%n”.

In C++, istream (such as cin) skips the leading spaces in the same way.

How to skip until end of line:

 input:
 100 101 102
 200 201 202
 program:
 scanf("%d%*[^\n]", &i1);
 scanf("%d%*[^\n]", &i2);
 printf("%d %d\n", i1, i2);
 output:
 100 200

“%*” throws away the matched string.

 another way:
 scanf("%d", &i1);
 while ((c = getchar()) != '\n' && c != EOF);   /* c is an int */
 scanf("%d", &i2);

How to read various floating-point numbers:

 input:
 1 0.1 1e-2
 program:
 scanf("%f%f%f", &f1, &f2, &f3);
 printf("%f %f %f\n", f1, f2, f3);
 output:
 1.000000 0.100000 0.010000

“%f” accepts integers, decimals and exponent notation.

How to skip lines that begin with a comment character:

 input:
 # This is a comment.
   # This is a comment.
     # This is a comment.
 string-data
 program:
 while (scanf(" #%[^\n]", s) == 1);   /* "== 1" also stops the loop at EOF */
 scanf("%s", s);
 printf("%s\n", s);
 output:
 string-data

The “ ” before “#%[^\n]” matches zero or more white space characters.
If the “#” does not match, only that leading white space is consumed, and the following “%s” would skip it anyway.

Working with a slow machine

We programmers have to wait for responses from the machine. When the machine is slow, as with a slow simulation or a long compilation, it disturbs our rhythm.

While waiting, I tend to think about what to do next, but most of the time I just wish the compilation would end sooner. Wishing is not my job, and I know that I’m not doing the right thing. Knowing that stresses me out.

So the first thing I do is stop watching the progress. When I look at the scrolling screen, I tend to go into wish mode, or spend the time looking for warnings as they scroll by.

Unless you are like a chess master, you can prepare only one step ahead. Preparing two steps ahead takes more effort and sometimes turns out to be wrong.

So I do something else instead; the best thing is something unrelated, such as unrelated reading or a short stretch. Sounds inefficient? But I have found that I tend to accomplish more than I expect by the end of the day, and I get less stressed.

Title roll

In a movie, the names of the important actors are shown at the beginning because knowing them helps the audience appreciate the movie. Minor actors are not listed there because they don’t matter. The same applies to local variables: list the variables that play an important role at the beginning of the function to help readers understand it more easily. Minor variables should not be listed there. For example,

    void search(.....) {
        int first;
        int last;  
        int mid;    
    .....
    }

Having those variables in mind (with appropriate names, please) helps in reading the function. Sometimes knowing those variables even makes the rest of the function obvious.

One exception: in C, casual loop counters may have to be listed at the top, because C before C99 allows declarations only at the start of a block. This is a flaw of the language. However, I sometimes do something like this to keep ‘i’ out of the local variable list.

    {
        int i;
        for (i = 0; i < .....) .....
    }

Or

    { int i; for (i = 0; i < .....) {
          .....
    }}

In this way, that part of the program becomes independent, so you can cut and paste it anywhere.

You can safely ignore

It sounds like the professional way to put it, probably because of the word “safely”.

– You can safely ignore the warnings.
– You can safely ignore the obsolete comments.
– You can safely ignore this part of the code, which is never executed.

Saying such things is an amateur attitude. If something really can be safely ignored, just remove it before showing it to other people.

Best bringup tool

When I bring up a bare board, I have to fight without an OS or libraries, and sometimes even without RAM. The most conservative solution in such a case is an ICE (in-circuit emulator). But I do not like it because

– An ICE is expensive.
– An ICE is of no use once the debug monitor is up.
– An ICE is usually hairy and buggy.

I use simple prints to the serial I/O console because, to me, the most important information is where the program is and what its status is. In the boot code, the first thing I do is initialize the serial I/O and print at least one character, which shows that the board got out of reset and started fetching instructions.

The boot code goes like this:

    #include "mydebug.h"

    // Initialize serial I/O.
    la         t0, SIO_BASE
    li         t1, .....        // baud rate, enable, etc.
    li         t2, .....
    sw         t1, PARAM1(t0)
    sw         t2, PARAM2(t0)

    // Show the first message.
    la         t0, SIO_TXFIFO
    li         t1, 'H'
    sw         t1, (t0)
    li         t1, 'i'
    sw         t1, (t0)
    li         t1, 0x0d
    sw         t1, (t0)
    li         t1, 0x0a
    sw         t1, (t0)

    // Test mydebug functions.
    MYPRINTS("MYPRINTS: Hello world.\r\n")
    MYPRINTS("MYPRINTD: ")
    MYPRINTD(1234567890)
    MYCRLF()
    MYMARKER()

The output looks like this:

    Hi
    MYPRINTS: Hello world.
    MYPRINTD: 1234567890
    [reset.S:32]

The last line is from the MYMARKER() macro, which I find the most useful.

Once the program jumps to C, I can use slightly better ones.

    #include "mydebug.h"

    int main()
    {
        MYPRINTS("MYPRINTS: Hello from C.\r\n");
        MYPRINTS("MYPRINTD: ");
        MYPRINTD(1234567890);
        MYCRLF();
        MYMARKER();
    }

The output looks like this:

    MYPRINTS: Hello from C.
    MYPRINTD: 1234567890
    [main.c:52 main()]

Here is mydebug.h. Use it at your own risk.

        //
        // For assembler program.
        //
        #ifdef __ASSEMBLER__

        #define MYPRINTC(C)                                             \
                li      t0, SIO_TXFIFO;                                 \
                li      t1, C;                                          \
                sw      t1, (t0);                                       \

        #define MYCRLF()                                                \
                MYPRINTC(0x0d);                                         \
                MYPRINTC(0x0a)

        #define MYPRINTS(S)                                             \
                b       1111f;                                          \
                nop;                                                    \
        8888:   .asciz  S;                                              \
                .align  2;                                              \
        1111:   la      t2, 8888b;                                      \
                la      t0, SIO_TXFIFO;                                 \
        2222:   lb      t1, (t2);                                       \
                beqz    t1, 9999f;                                      \
                nop;                                                    \
                sw      t1, (t0);                                       \
                addi    t2, 1;                                          \
                b       2222b;                                          \
                nop;                                                    \
        9999:   

        #define MYPRINTD(V)                                             \
                li      t0, SIO_TXFIFO;                                 \
                li      t2, V           /* value */;                    \
                li      t3, 1000000000  /* divisor */;                  \
                li      t4, 10;                                         \
                li      t5, 1           /* leading zero flag */;        \
        1111:   divu    t2, t3;                                         \
                mflo    t1              /* digit */;                    \
                mfhi    t2              /* mod */;                      \
                beqz    t5, 3333f;                                      \
                nop;                                                    \
                beqz    t1, 2222f;                                      \
                nop;                                                    \
                move    t5, zero;                                       \
        3333:   addi    t1, '0';                                        \
                sw      t1, (t0);                                       \
        2222:   divu    t3, t4;                                         \
                mflo    t3;                                             \
                bnez    t3, 1111b;                                      \
                nop

        #define MYMARKER()                                              \
                MYPRINTC('[');                                          \
                MYPRINTS(__FILE__);                                     \
                MYPRINTC(':');                                          \
                MYPRINTD(__LINE__);                                     \
                MYPRINTC(']');                                          \
                MYCRLF()

        //
        // For C program.
        //
        #else // #ifdef __ASSEMBLER__

        #define MYPRINTC(C) {                                                   \
            *(volatile unsigned int*)SIO_TXFIFO = C;                            \
        }
        #define MYPRINTD(V) {                                                   \
            char buf[20];                                                       \
            int n = 0;                                                          \
            unsigned int val = (V);                                             \
            do {                                                                \
                unsigned int digit = val % 10;                                  \
                buf[n++] = '0' + digit;                                         \
                val /= 10;                                                      \
            } while (val);                                                      \
            while (n > 0) MYPRINTC(buf[--n]);                                   \
        }
        #define MYPRINTS(S) {                                                   \
            const char* p = S;                                                  \
            while (*p) MYPRINTC(*p++);                                          \
        }
        #define MYCRLF() {                                                      \
            MYPRINTS("\r\n");                                                   \
        }
        #define MYMARKER() {                                                    \
                MYPRINTS("[");                                                  \
                MYPRINTS(__FILE__);                                             \
                MYPRINTS(":");                                                  \
                MYPRINTD(__LINE__);                                             \
                MYPRINTS(" ");                                                  \
                MYPRINTS(__PRETTY_FUNCTION__);                                  \
                MYPRINTS("()]\r\n");                                            \
        }

        #endif // #ifdef __ASSEMBLER__

Binary search if cause is far away

I talked about debugging as binary search.

What if the problem is caused far away, and the symptom shows up later? For example, suppose function1() in the following code inadvertently writes beyond the end of buffer[] and corrupts the value of globalp.

    char buffer[128];
    char* globalp;
    function1() { ... }
    function2() { ... }

Later, function2() uses globalp and goes south. This type of bug is one of the hardest to find. At first, the OK-NG chart looks like this:

    --OK----------------------------------------------NG---------
                |<--function1-->|             |<--function2-->|

There are two reasons why this type of bug is hard. By knowing the nature of the difficulty, we may be able to handle the bug better.

The first reason is the OK point. We don’t know when the OK condition is disturbed. So we cannot start the binary search yet.

The second reason is the NG point. In this case the NG point is not where the real bug is. Instead, it is where the symptom appears, such as segmentation fault. But because it is the only reference point, we have to treat it as an initial NG point, and we should not forget that the real bug is not there.

First, we have to look closely at the NG point and see what is causing the symptom, such as the segmentation fault. Once we find that globalp is pointing to the wrong place, we can redraw the OK-NG chart.

    --OK----------------------------------------------NG---------
                |<--function1-->|             |<--function2-->|

The chart looks exactly the same as before, but it has a different meaning. NG is where globalp points to the wrong place. OK is where globalp is sane. Now we can start the binary search just by looking at the value of globalp.

The important thing is that when you draw the first OK-NG chart in your mind, you should know that it is only temporary, and you should do whatever it takes to get to the second OK-NG chart.

The last element may not be it

I talked about debugging as binary search in
https://droid7c2.wordpress.com/2009/02/10/debugging-as-binary-searchdebugging-as-binary-search/

That bug was the kind that lives on the time axis, along which we performed the binary search. This time, I would like to talk about a bug that lives in space. In this case, we cannot do a binary search.

Suppose you got a new USB device. You attached it to your PC, and the device didn’t work.

    PC --- USBdevice

You would probably start by blaming the USB device. If you purchased the device, you might call customer service. If you built the device, you might start debugging it. And this is a typical way to waste time.

In fact, there are two cases you have to consider equally.

    PC --- USBdevice

            |
            V

    PC  --- USBdevice2
    PC2 --- USBdevice

My point is to swap one element at a time until you get an OK case. You try to (1) swap the USB device for a second one, or (2) swap the PC for a second one. If you don’t have a second USB device, you cannot do (1). But if you have a second PC, (2) is easy to do. You should weigh those possibilities equally to get an OK point.

But is this all? No. If you look at the setup in the following way, you may want to swap the USB cable, too. By the way, this really happened to me. After tweaking the USB device for quite a while, I found that the culprit was the USB cable.

    PC --- USBcable --- USBdevice

And that is not the end. For example,

    PC --- USBdriver --- USBcable --- USBdevice
                                          |
                                      ACadapter

Because so many elements are involved, binary search is not suitable for this kind of debugging. I usually swap one element at a time. However, one rule is still valid: get OK and NG cases first.