More Data Validation
Consider the following program:

    #include <stdio.h>

    int main( void )
    {
        int number;

        printf("Please enter a number\n");
        scanf("%d", &number );
        printf("The number you entered was %d\n", number );
        return 0;
    }
The above program has several problems
Perhaps the best way of handling input in C programs is to treat all input as a sequence of characters, and then perform the necessary data conversion.
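As a rough illustration of "read characters first, convert afterwards", the sketch below converts a line of text to an int with the standard strtol() function and reports failure instead of silently producing a value. The helper name parse_int() and its return convention are assumptions made for this example, not part of the original program.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert text to an int.
   Returns 1 and stores the result in *out on success, 0 on failure. */
int parse_int( const char *text, int *out )
{
    char *end;
    long value;

    errno = 0;
    value = strtol( text, &end, 10 );
    if( end == text )                       /* no digits were found */
        return 0;
    while( *end == ' ' || *end == '\n' )    /* allow trailing space or newline */
        end++;
    if( *end != '\0' )                      /* trailing junk such as "12abc" */
        return 0;
    if( errno == ERANGE || value < INT_MIN || value > INT_MAX )
        return 0;                           /* does not fit in an int */
    *out = (int) value;
    return 1;
}
```

A caller might read a whole line with fgets( line, sizeof line, stdin ) and pass line to parse_int(), prompting again whenever it returns 0.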
At this point it is worth exploring two related concepts: trapping data at the source, and the ripple through effect.
Trapping Data At The Source
This means that data is best checked for correct range/limit
and data type at the point of entry. The
benefits of doing this at the time of data entry are
The Ripple Through Effect
This refers to the problem of incorrect data which is allowed to
propagate through the program. An example of this is sending
invalid data to a function to process.
By trapping data at the source, and ensuring that it is correct as to its data type and range, we ensure that bad data cannot be passed onwards. This makes the code which works on processing the data simpler to write and thus reduces errors.
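A minimal sketch of the idea, assuming a value that must be a percentage between 0 and 100. Both function names are hypothetical and chosen only for this illustration: the check runs once at the point of entry, so the processing code needs no checks of its own.

```c
/* Hypothetical range check, called once where the value enters the program. */
int is_valid_percent( int value )
{
    return value >= 0 && value <= 100;
}

/* Processing code: it may assume 0..100, because anything else
   is expected to have been rejected at the point of entry. */
double percent_to_fraction( int percent )
{
    return percent / 100.0;
}
```

The input routine would call is_valid_percent() immediately after reading the value and ask again on failure, so percent_to_fraction() never sees bad data.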
An example
Let's look at the case of handling user input. Now, we
know that users of programs out there in user-land are a bunch of
annoying people who spend most of their time inventing new and
more wonderful ways of making our programs crash.
Let's try to implement a general-purpose way of handling data input, as a replacement for scanf(). To do this, we will implement a function which reads the input as a sequence of characters.
The function is readinput(), which, in order to make it more versatile, accepts several parameters: the buffer that receives the characters, the mode (DIGIT, ALPHA or STRING) that selects which characters are accepted, and the maximum number of characters to store.
We have used some of the functions declared in ctype.h to check the data type of the input data.
    /* version 1.0 */
    #include <stdio.h>
    #include <stdlib.h>   /* atoi() */
    #include <ctype.h>

    #define MAX    80  /* maximum length of buffer */
    #define DIGIT  1   /* data will be read as digits 0-9 */
    #define ALPHA  2   /* data will be read as letters A-Z, a-z */
    #define STRING 3   /* data is read as ASCII */

    /* buff must have room for limit characters plus the terminating null */
    void readinput( char buff[], int mode, int limit )
    {
        int ch, index = 0;

        ch = getchar();
        /* stop at end of line, end of file, or when the buffer is full */
        while( (ch != '\n') && (ch != EOF) && (index < limit) ) {
            switch( mode ) {
                case DIGIT:
                    if( isdigit( ch ) ) {
                        buff[index] = ch;
                        index++;
                    }
                    break;
                case ALPHA:
                    if( isalpha( ch ) ) {
                        buff[index] = ch;
                        index++;
                    }
                    break;
                case STRING:
                    if( isascii( ch ) ) {  /* isascii() is a POSIX extension */
                        buff[index] = ch;
                        index++;
                    }
                    break;
                default:
                    /* this should not occur */
                    break;
            }
            ch = getchar();
        }
        buff[index] = 0x00;  /* null terminate input */
    }

    int main( void )
    {
        char buffer[MAX];
        int number;

        printf("Please enter an integer\n");
        readinput( buffer, DIGIT, MAX - 1 );  /* leave room for the null */
        number = atoi( buffer );
        printf("The number you entered was %d\n", number );
        return 0;
    }
Of course, there are improvements to be made. We can change readinput() to return an integer value which represents the number of characters stored. This helps in determining whether data was actually entered. In the above program, it is not clear whether the user actually entered any data (although we could have checked whether buffer held an empty string).
So let's now make the changes and see what the modified program looks like:
    /* version 1.1 */
    #include <stdio.h>
    #include <stdlib.h>   /* atoi() */
    #include <ctype.h>

    #define MAX    80  /* maximum length of buffer */
    #define DIGIT  1   /* data will be read as digits 0-9 */
    #define ALPHA  2   /* data will be read as letters A-Z, a-z */
    #define STRING 3   /* data is read as ASCII */

    /* buff must have room for limit characters plus the terminating null;
       returns the number of characters stored */
    int readinput( char buff[], int mode, int limit )
    {
        int ch, index = 0;

        ch = getchar();
        /* stop at end of line, end of file, or when the buffer is full */
        while( (ch != '\n') && (ch != EOF) && (index < limit) ) {
            switch( mode ) {
                case DIGIT:
                    if( isdigit( ch ) ) {
                        buff[index] = ch;
                        index++;
                    }
                    break;
                case ALPHA:
                    if( isalpha( ch ) ) {
                        buff[index] = ch;
                        index++;
                    }
                    break;
                case STRING:
                    if( isascii( ch ) ) {  /* isascii() is a POSIX extension */
                        buff[index] = ch;
                        index++;
                    }
                    break;
                default:
                    /* this should not occur */
                    break;
            }
            ch = getchar();
        }
        buff[index] = 0x00;  /* null terminate input */
        return index;
    }

    int main( void )
    {
        char buffer[MAX];
        int number, digits = 0;

        /* ask again until something is entered, or input runs out */
        while( digits == 0 && !feof( stdin ) && !ferror( stdin ) ) {
            printf("Please enter an integer\n");
            digits = readinput( buffer, DIGIT, MAX - 1 );
            if( digits != 0 ) {
                number = atoi( buffer );
                printf("The number you entered was %d\n", number );
            }
        }
        return 0;
    }
The second version is a much better implementation, because the caller can now tell when nothing usable was entered and ask again.
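To see what the mode filter inside readinput() does to a line, the same test can be applied to a string instead of the keyboard. This is only a sketch: filter_line() is a hypothetical name, and only the DIGIT and ALPHA modes are reproduced. Note that characters which fail the test are dropped rather than reported, so a line such as "12abc3" is stored as "123".

```c
#include <ctype.h>

#define DIGIT 1   /* keep digits 0-9 only */
#define ALPHA 2   /* keep letters only */

/* Hypothetical string-based variant of the readinput() filter.
   buff must have room for limit characters plus the terminating null.
   Returns the number of characters stored. */
int filter_line( const char *line, char buff[], int mode, int limit )
{
    int index = 0;

    while( *line != '\0' && *line != '\n' && index < limit ) {
        unsigned char ch = (unsigned char) *line++;

        if( (mode == DIGIT && isdigit( ch )) ||
            (mode == ALPHA && isalpha( ch )) )
            buff[index++] = (char) ch;
    }
    buff[index] = '\0';   /* null terminate the result */
    return index;
}
```

A return value of 0 means that no acceptable characters were found, which corresponds to the case that version 1.1 uses to prompt again.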