requirements for the Key Words in Context System, Initial System
================================================================

assignment 6, Dec. 5, 2002

- required to run on Unix systems only
- runs on standard PC hardware
- all the requirements from the slide
- input interface:
  - read from file
  - file name: from command line argument
  - contents:
    - 8-bit ASCII
    - line separation: linefeed
    - word separation: blank, tab
    - locale: C
  - size:
    - lines max 1024 chars
    - max 1000000 lines
    - max 10000000 chars
- may use temporary files
- sorting:
  - by ASCII order
- output:
  - write to file
  - name: from command line argument
  - just the shifted, sorted lines
  - 8-bit ASCII
  - line separation: linefeed
  - word separation: blank
- command line arguments
  - two, both mandatory
  - first: input file
  - second: output file
- should always terminate if the other requirements are met
- some "reasonable" time for processing


modules and their interfaces
============================

general: use C++ as implementation language

- line holder module
  - exported data types:
    - text
    - line
    - word
    - character
  - interface programs:
    (for the parameters: all counts start at 0)
    - inline void text::setChar(unsigned int lineNo, unsigned int wordNo,
                                unsigned int charNo, character ch)
    - inline character text::getChar(unsigned int lineNo, unsigned int wordNo,
                                     unsigned int charNo)
    - inline unsigned int text::lines()
    - inline unsigned int text::words(unsigned int lineNo)
    - inline unsigned int text::characters(unsigned int lineNo,
                                           unsigned int wordNo)

- input module
  - interface programs:
    - text *input(const char *inFileName)

- circular shifter module
  - interface programs:
    - class circularShift: public text
    - circularShift::circularShift(text &inText)

- sorting module
  - interface programs:
    - class alphSort: public text
    - alphSort::alphSort(text &inText)

- output module
  - interface programs:
    - void output(const char *outFileName, text &inText)

- master control module
  - interface programs:
    - int main(int argc, char *argv[])

- command line module
  - interface programs:
    - cmdLine::cmdLine(int argc, char *argv[])
    - const char *cmdLine::inFileName()
    - const char *cmdLine::outFileName()


open problems:
==============

- Are multiple, immediately consecutive word breaks preserved?
  -> proposal: no
- How do word breaks sort?
  -> proposal: like a single blank
- We don't need the data types "line" and "word" in this setup.
  -> proposal: delete them
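
sketch of the shift and sort behaviour:

The following is only a minimal sketch of what the circular shifter and sorting modules are expected to compute. It does not use the module interfaces listed above. The names Words, splitWords, joinWords, circularShifts, asciiLess and kwicLines are hypothetical helpers introduced for illustration. The sketch assumes both open-problem proposals are accepted: consecutive word breaks collapse, and a word break sorts like a single blank. File handling and command line parsing are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the spec's "line": a sequence of words.
using Words = std::vector<std::string>;

// Split a raw line at blanks and tabs. Runs of separators collapse,
// following the open-problem proposal (an assumption, not a decision).
Words splitWords(const std::string &raw) {
    assert(raw.size() <= 1024);  // spec: lines max 1024 chars
    Words words;
    std::string current;
    for (char ch : raw) {
        if (ch == ' ' || ch == '\t') {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(ch);
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Join words with a single blank, the output word separator.
std::string joinWords(const Words &words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i > 0) out.push_back(' ');
        out += words[i];
    }
    return out;
}

// Produce every circular shift of one line, one per starting word.
std::vector<std::string> circularShifts(const Words &words) {
    std::vector<std::string> shifts;
    for (std::size_t start = 0; start < words.size(); ++start) {
        Words shifted(words.size());
        std::rotate_copy(words.begin(),
                         words.begin() + static_cast<std::ptrdiff_t>(start),
                         words.end(), shifted.begin());
        shifts.push_back(joinWords(shifted));
    }
    return shifts;
}

// Byte-wise "ASCII order": compare characters as unsigned 8-bit values.
bool asciiLess(const std::string &a, const std::string &b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}

// Shift every input line, then sort all shifted lines together.
std::vector<std::string> kwicLines(const std::vector<std::string> &rawLines) {
    std::vector<std::string> result;
    for (const std::string &raw : rawLines) {
        for (const std::string &shift : circularShifts(splitWords(raw))) {
            result.push_back(shift);
        }
    }
    std::sort(result.begin(), result.end(), asciiLess);
    return result;
}
```

For the two input lines "b  a" (two blanks) and "c", kwicLines returns "a b", "b a", "c". Because the order is plain ASCII, upper-case letters sort before lower-case ones, so the line "a Z" yields "Z a" before "a Z".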