Your mark is based on your last submission.

For questions regarding the marking program, please contact our GA Yaxin Li at

Regular expression in Java (5 points)

Due time: Midnight Jan 28, 2017

Where to submit

You can submit your assignment at our submission site. You can submit up to 10 times, and the mark of your last submission will be your final mark. You are welcome to contact our GAs for help with the assignment.

Here is a list of common errors that you may want to double check before submission:

  • Print statements: our marking program does not allow them, and you would not be able to see the printed output on our server anyway, so remove all print statements.
  • Inner classes: the marking program renames classes, but only the outermost class. If you have an inner class, there will be two different class names. An inner class is not needed and only confuses the marking program, so remove any inner classes.
  • Missing close(): sometimes the output file is empty because the close() statement was removed to save space. You cannot save space this way.
  • Avoid extra-long lines.


    Warm up with Java programming. Get familiar with regular expressions. Understand the wide application of regular expressions.

    Assignment specification

    Your job is to count the number of identifiers in programs written in our Tiny language.

    You should pick out the identifiers from a text file and write the output to a text file named A1.output. Note that the output file should contain a line like "identifiers:5". Here are the sample input and output files. The input will have multiple lines. Please note that in this sample program the following are not counted as identifiers:

    Here are the test cases for the assignment: case 1, case 2, case 3, case 4, case 5, case 6. (ID counts: 5 4 6 7 8 9)

    In this assignment you may assume that there are no comments in the programs.

    In the output file you should only write "identifiers:" followed by the number of identifiers. If there are multiple occurrences of an identifier in the input, you should only count it once. Don't write anything else into the output file.
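    The "count it once" rule can be sketched with a set, which keeps only one copy of each name. This is a minimal illustration, not the sample solution; the class name DistinctCountSketch and the method outputLine are hypothetical names introduced here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the assignment hand-out.
class DistinctCountSketch {
    // Builds the single output line from all identifier occurrences,
    // counting each distinct name once.
    static String outputLine(List<String> occurrences) {
        Set<String> distinct = new HashSet<>(occurrences); // duplicates collapse here
        return "identifiers:" + distinct.size();
    }
}
```

    For the occurrences x, y, x, sum, y this yields "identifiers:3", because x and y are each counted only once.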

    You will write two different programs to do this:

    1. Program A11 must not use regular expressions: neither the regex package nor the methods involving regular expressions in the String class or other classes. Hence it will rely on StringTokenizer. Please refer to the API JavaDoc for more details of the StringTokenizer specification.
    2. Program A12 will use java.util.regex. Two useful links to start with are the JavaDoc of regex and a tutorial for Java regex.
      In A12, you should not use StreamTokenizer or StringTokenizer.
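    To make the difference between the two approaches concrete, here is a rough sketch of both side by side. It rests on several assumptions that are not stated in this page: the keyword list, the delimiter characters, the rule that an identifier is a letter followed by letters or digits, and the rule that text inside double quotes is not an identifier. The names IdentifierSketch, KEYWORDS, DELIMS, viaTokenizer and viaRegex are all hypothetical. Adjust every one of these to the actual Tiny language rules.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdentifierSketch {
    // ASSUMPTION: hypothetical keyword list; replace it with the words
    // that the assignment says are not identifiers.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "INT", "REAL", "STRING", "IF", "ELSE", "BEGIN", "END",
            "MAIN", "READ", "WRITE", "RETURN"));

    // ASSUMPTION: characters that separate tokens in a Tiny program.
    static final String DELIMS = " \t\r\n+-*/=:;,(){}<>!\"";

    // A11 style: StringTokenizer only, no regular expressions.
    static Set<String> viaTokenizer(String source) {
        Set<String> ids = new HashSet<>();
        boolean inString = false; // true while between two double quotes
        // returnDelims = true so the quote characters are seen as tokens
        StringTokenizer st = new StringTokenizer(source, DELIMS, true);
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            if (t.equals("\"")) {
                inString = !inString;
            } else if (!inString && Character.isLetter(t.charAt(0))
                    && !KEYWORDS.contains(t)) {
                ids.add(t); // the set keeps each name once
            }
        }
        return ids;
    }

    // A12 style: java.util.regex. The first alternative swallows a quoted
    // string; the second captures a candidate identifier in group 1.
    static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|\\b([A-Za-z][A-Za-z0-9]*)\\b");

    static Set<String> viaRegex(String source) {
        Set<String> ids = new HashSet<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String id = m.group(1); // null when a quoted string matched
            if (id != null && !KEYWORDS.contains(id)) {
                ids.add(id);
            }
        }
        return ids;
    }
}
```

    Both methods return a set, so the identifier count is simply the size of the result. Under the assumptions above, the two methods agree on inputs made of letters, digits and the listed delimiters; they may differ on characters outside that set, such as underscores.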

    Your programs should be able to run by typing:

      %java A11 A1.tiny
      %java A12 A1.tiny

    In this assignment, the output should be written to a file called "A1.output". You should not use keyboard input. The input file name will be provided as the argument of the program, while the output file name is hard-coded in your programs. That is, your code for input and output can look like the following:

                ...  new BufferedReader(new FileReader(args[0]));
                ...  new BufferedWriter(new FileWriter("A1.output"));
    Your program should be tested on luna or bravo.
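    The input and output handling described above could be wired together roughly as follows. This is only a sketch: IoSketch, run and countIdentifiers are hypothetical names, and countIdentifiers is a placeholder that always returns 0. The try-with-resources blocks close the reader and the writer automatically, which guards against the empty-output problem caused by a missing close().

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class IoSketch {
    // Reads the whole input file, then writes one result line to the output file.
    static void run(String inputPath, String outputPath) throws IOException {
        StringBuilder source = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new FileReader(inputPath))) {
            String line;
            while ((line = in.readLine()) != null) {
                source.append(line).append('\n');
            }
        }
        int count = countIdentifiers(source.toString());
        try (BufferedWriter out = new BufferedWriter(new FileWriter(outputPath))) {
            out.write("identifiers:" + count);
        } // leaving this block closes the writer, which flushes the buffer to disk
    }

    // HYPOTHETICAL placeholder: the real counting logic goes here.
    static int countIdentifiers(String source) {
        return 0;
    }

    public static void main(String[] args) throws IOException {
        // Input name comes from the command line; output name is hard-coded.
        run(args[0], "A1.output");
    }
}
```

    Note that the sketch contains no print statements, in line with the list of common errors above.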

    Please don't write unnecessarily long programs. The sample solutions for A11 and A12 together consist of approximately 300 words, as counted by the PHP function str_word_count(). They were not deliberately written to be short and can easily be compacted further. Hence one mark is given if your word count is smaller than 300.

    Marking Scheme

     if (, are not sent properly) return; 
     for (each of A11, A12)
          if (it is compiled correctly)  yourMark+=0.2;
     for (each of A11, A12){
          if (your java program reads A1.tiny && generates result file A1.output)
               for (each of the 6 test cases)
                    if (it is correct)   yourMark+=0.3;
          if (yourCode.length() < average(length of A11 in the class)) yourMark+=0.5;
     }
     for (each day of your late submission)  yourMark=yourMark*0.8;
     One bonus mark for the shortest code among the class.

    What to submit

    You should submit and