Course Overview

This course introduces theoretical and practical aspects of both the current web and the semantic web. Topics will include web characteristics, the web graph and analytics, and techniques for building a search engine, including crawling, page ranking, and latent semantic indexing. We will also cover web ontology.

Time and Place

Schedule for final project presentation

Note that each presentation should be 15 minutes long, followed by 5 minutes for questions and answers.

Each group should submit a final project report of no more than 1000 words no later than December 15. A web site with more details of the project, including a live demo, should also be available by December 15.

Nov 26
  1. group 5: Nwokeoji, Nnamdi Kingsley. web site
  2. group 6: Reid, James H. and Willis, Jordan T. web site
  3. group 7: Sun, Xu and Zheng, Shaochen. web site
  4. group 8: Wu, Jiayi and Zhang, Yi. web site
Nov 28
  1. group 1: Agboola, Adewale Taiwo and Chachad, Amrutraj. web site
  2. group 2: Cao, Xiao Ni and Cui, Xiutian. web site
  3. group 3: Chittle, Joshua David and Donais, Jonathan Alexander. web site
  4. group 4: Gherani, Mohit and Sumon, Mahmud Hossan. web site

Tentative content of the course includes:
  • Week 10, 11, 12: Lecture slides: Matrix decomposition and latent semantic indexing. Slides by Jure Leskovec, LSI slides (pdf), Graph sampling slides (pdf).
  • Week 13: Project presentation and evaluation

    Ontology Data

    The following table lists some ontologies and their corresponding graphs.
    Ontology; graph txt file; graph
    1. SUMO.owl (323KB); sumoSubclass.txt (~700 edges); subclass fdp layout
    2. All the relations extracted using Jena (the Java program); sumoAll.txt (~2700 edges); sumoAll fdp layout, circle layout, twopi layout
    3. dbpedia.owl (~700KB); graph txt
    4. SUMO2.owl (~30MB); graphviz txt; twopi layout (79MB), a 3D fig produced by Wang Hao
    5. SUMO2 subclass (5000 edges); graphviz txt; fdp fig, dot fig, twopi fig
    6. SUMO2 subclass & domain range (7000 edges); graphviz txt; fdp fig, dot fig, twopi fig
    7. SUMO2 domain range (1700 edges); graphviz txt; fdp fig, dot fig, twopi fig
    8. SUMO2 classes; graphviz txt; fdp fig, dot fig, twopi fig
    The semantic web data are extracted from various sources. For DBpedia data, here is a sample Java program to construct a graph from the SUMO2 (~30MB) ontology. The output is a txt file describing the graph in Graphviz format.
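As a rough illustration of the output step only, the sketch below writes a list of (subclass, superclass) pairs as Graphviz DOT text. It is not the linked course program: the class name `OntologyDotSketch`, the method `toDot`, and the sample edges are hypothetical, and the step that extracts relations from the ontology with Jena is left out so that the example runs with the standard library alone.

```java
// Hypothetical helper, not the course's program: turns edge pairs into DOT text.
class OntologyDotSketch {

    // Wrap a node name in double quotes and escape embedded double quotes,
    // so names containing spaces or punctuation remain valid DOT identifiers.
    static String quote(String name) {
        return "\"" + name.replace("\"", "\\\"") + "\"";
    }

    // Each edge is a two-element array {from, to}, e.g. {subclass, superclass}.
    static String toDot(String graphName, String[][] edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(quote(graphName)).append(" {\n");
        for (String[] edge : edges) {
            sb.append("  ").append(quote(edge[0]))
              .append(" -> ").append(quote(edge[1])).append(";\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample edges for illustration; not read from any ontology file.
        String[][] edges = {
            {"Mammal", "Vertebrate"},
            {"Vertebrate", "Animal"}
        };
        System.out.print(toDot("sample", edges));
    }
}
```

Redirecting the printed text to a file gives a graph definition that the Graphviz layout programs can read.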

    The graph is then drawn using Graphviz. Graphviz can create several different layouts from a graph definition. One layout engine, fdp, uses a spring model, so that nodes with stronger connections are placed closer together on the canvas. The fdp layout result is here. Graphviz can be downloaded here. The command to run Graphviz is here. It also shows how to generate the different layouts.
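The linked command is not reproduced on this page, so the following is only a sketch of what such invocations typically look like. It assumes the standard Graphviz executables (`fdp`, `dot`, `twopi`, `circo`) are on the PATH and uses their `-T` (output format) and `-o` (output file) options. The file name `graph.txt`, the output naming scheme, and the guess that the "circle layout" corresponds to `circo` are assumptions for illustration.

```java
import java.util.List;

// Hypothetical helper: builds one Graphviz command line per layout engine.
class GraphvizLayoutSketch {

    // Returns e.g. [fdp, -Tpng, graph.txt, -o, graph_fdp.png].
    static List<String> layoutCommand(String engine, String input, String format) {
        int dot = input.lastIndexOf('.');
        // Strip the extension, if any, to derive the output file name.
        String base = dot > 0 ? input.substring(0, dot) : input;
        String output = base + "_" + engine + "." + format;
        return List.of(engine, "-T" + format, input, "-o", output);
    }

    public static void main(String[] args) {
        // Print one command per layout engine; the same input file is reused for each.
        for (String engine : new String[] {"fdp", "dot", "twopi", "circo"}) {
            System.out.println(String.join(" ", layoutCommand(engine, "graph.txt", "png")));
        }
    }
}
```

Each returned list can be passed to `new ProcessBuilder(command)` to run the layout from Java, or the printed line can be pasted into a shell. Large graphs such as the full SUMO2 export may take a long time to lay out.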

  • Reference books