
Sally - A Tool for Embedding Strings in Vector Spaces
Copyright (C) 2010 Konrad Rieck (konrad@mlsec.org)
--

Development of Input Modules

Sally can be extended to support additional input formats for
strings. Developing a new input module involves three steps:
  
  1) Create the source files 'input_xxx.c' and 'input_xxx.h' for the
     new input module xxx. Add these files to 'Makefile.am' to include
     them in the compilation process of Sally.
  
  2) Implement three functions in 'input_xxx.c' and add the
     corresponding declarations to 'input_xxx.h'.
           
     int input_xxx_open(char *name);
     
     This function opens the input source for reading strings. For
     example, if 'xxx' refers to an archive, this function opens the
     archive and prepares it for loading entries. The function
     returns the number of available entries.
     
     int input_xxx_read(string_t *strs, int len);
     
     This function reads a block of strings. The parameter 'strs' is
     used to store the loaded strings and respective information (see
     input.h). The array needs to be allocated by the caller; its
     length is given in 'len'. The function returns the number of
     loaded strings; 0 indicates the end of the input source.
       
     void input_xxx_close();
     
     This function closes the input source for the format xxx. Memory
     allocated by input_xxx_open() should be freed here. Open files
     and similar objects should be closed.
       
  3) Integrate the new interface into Sally by extending the code in
     'input.c'. First, add 'input_xxx.h' to the list of included
     headers and, second, extend the function input_config() to
     initialize the new input format if requested.
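
The three functions from step 2 could be sketched as follows for a
hypothetical module that reads one string per line from a text file.
This is a minimal sketch, not code from Sally: the 'string_t' below is
a reduced stand-in for the type declared in input.h (which may carry
further fields), and the helper read_line(), the return value -1 on
error, and the rule that the caller frees the loaded strings are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for string_t from input.h; the real
 * structure may contain additional fields. */
typedef struct {
    char *str;   /* string data, allocated by this module */
    int len;     /* length of the string in bytes */
} string_t;

/* Module state shared by open, read and close */
static FILE *in = NULL;

/* Open the file 'name' and return the number of lines (entries).
 * Returning -1 on failure is an assumption of this sketch. */
int input_xxx_open(char *name)
{
    int c, last = '\n', num = 0;

    assert(name != NULL);
    in = fopen(name, "r");
    if (!in)
        return -1;

    /* One entry per newline, plus an unterminated final line */
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n')
            num++;
        last = c;
    }
    if (last != '\n')
        num++;

    rewind(in);
    return num;
}

/* Hypothetical helper: read one line of arbitrary length.
 * Returns NULL at the end of the input or if memory runs out. */
static char *read_line(FILE *f, int *len)
{
    size_t cap = 64, n = 0;
    int c;
    char *buf = malloc(cap);

    if (!buf)
        return NULL;
    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (n + 1 >= cap) {
            char *tmp = realloc(buf, cap *= 2);
            if (!tmp) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[n++] = (char) c;
    }
    if (c == EOF && n == 0) {
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *len = (int) n;
    return buf;
}

/* Load up to 'len' strings into the caller-allocated array 'strs'.
 * Returns the number of loaded strings; 0 signals the end. */
int input_xxx_read(string_t *strs, int len)
{
    int i;

    assert(strs != NULL && len > 0);
    for (i = 0; in && i < len; i++) {
        strs[i].str = read_line(in, &strs[i].len);
        if (!strs[i].str)
            break;
    }
    return i;
}

/* Close the input source and reset the module state */
void input_xxx_close()
{
    if (in)
        fclose(in);
    in = NULL;
}
```

A caller would invoke input_xxx_open() once, call input_xxx_read()
repeatedly until it returns 0, and finish with input_xxx_close().
Keeping the file handle in a static variable mirrors the interface
above, where read and close take no handle argument.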
     
 That's it. Please contribute to the development of Sally. Send your
 new modules to konrad@mlsec.org, so that they can be included in the
 next release.
  