16th International World Wide Web Conference
(WWW 2007)
Authors: Raphael, Zeljko, Lynda
CWI participants: Raphael, Zeljko, Lynda, Steven, Ivan
# participants: around 1000
A wonderful venue for this year's Web conference. A nice (recorded) video showed the history of the Web conferences. A new record in the number of papers submitted!
Some notes about the W3C Advisory Committee Meeting:
http://www.w3.org/2005/Incubator/mmsem/talks/AC2007/
Raphael's slides
This is what the audience got out of my talk (from the IRC log)!
MMSEM XG had 37 participants from 15 organizations ... many organizations work on multimedia metadata standards ... we wanted to show the benefits of combining all these metadata formats into RDF ... and provide best practices and use cases for using multimedia on the web ... example: EXIF + low-level feature extraction using MPEG7, stored into flickr, combined with metadata from other standards ... another use case on music: combining MP3 metadata with FOAF ... knowing where your favorite artists are giving concerts ... see panel in W3C track this Thursday ... and Photo Metadata Conference next month ... making a call for a follow-up XG
I had interesting chats with the AC reps of Adobe and Apple about a possible involvement in a future follow-up XG activity. I still need to contact IBM (John Smith), Yahoo! (Mor) and Boeing (Mike Uschold) as well.
Interesting slides; Tim made a parallel between Web 2.0 and the open source paradigm shift. People don't realize that Google is one of the biggest Linux apps!
Interesting slide on "Understanding open source": modular architecture, internet-enabled collaboration, users as co-contributors, viral distribution and marketing.
Key sentence: Data is the next "Intel Inside". Web 2.0 sites put friendly front-ends on existing databases and open them up to user contributions.
One way of looking at the value of the Web: when you put something on the Web, you get unexpected and serendipitous reuse. With the Web, basic reuse happens by following links. It's a myth to say that we don't need links... Google is using the links.
Web 2.0, compared to the Semantic Web, gives you _limited_ reuse. As TimO points out, the current incentive is for sites to jealously guard data. But that limits reuse. Users will demand their data back, and we'll start to see more reuse outside of single sites. When you have more linking among user data (e.g., same person, same protein, same place, etc.), you'll get reuse across organization and application boundaries. I want to view all the photos of me, no matter who uploaded them and where. Or I want to reuse information about a protein no matter where the data is. We are currently only getting a fraction of the potential reuse of our data.
Two new groups: the WebAPI and Web Application Formats (WAF) WGs. WAF WG: the focus is new syntax and new languages. WebAPI WG: Web application programming interfaces - the DOM and other interfaces.
On our internal systems and how they are Web 2.0-like. The vast majority of our systems are geared towards being collaborative systems for a broad number of users. Some of this data is already available in RDF, and through UIs. The challenge we will face is to make more of it available, and to associate people with the data they seek. We need to check whether this is in accordance with our privacy policies. Tim's Tabulator is a playpen for browsing data. We are using our own technologies, but also external (non-W3C) systems, especially for inspiration.
Kingsley: Perhaps the Semantic Web is web^3. I think open data is a feasible business model. TimO: "licensed access" rather than "open access".
Paul Downey: Some people think about Web 2.0 as "rich user experience"; TimO indicates that it's really a business model. Privacy today is like sex + drugs in the 1960s. W3C could help with identity around open data. I might want to mash up my photos with my bank account information, but I don't want to make the bank info public. I think the next stage is to mix data that is public with data that I don't want to be public.
TBL: The value of the Web is to make life smoother for people (e.g., finding flights, etc.). A world where people have access to all the data they logically and legally can access is a more powerful one for the user.
TimO: Look at how the credit card emerged. The credit card is an aggregator (of access to banks). Banks were not made more interoperable; credit card companies are aggregators. There is an opportunity for aggregation where there is no interop. Those become the new centralization sites.
TBL: Yes, they'll be able to do this for bank data, but not for all data. And that bank aggregator will be an element within a much larger Web of data. At the end of the day, we are still going to end up with an economy and then an ecology.
TG: We need to find ways to delegate authority to other services in the face of aggregation. For example, W3C wants to make a lot of member-only data available to members in RDF, and wants to ensure that it remains confidential when used by an aggregation service.
The web2.0 maturity model? http://files.skyscrapr.net/users/jevdemon/YetAnotherAcronym_A1D2/image0_thumb8.png
General impressions:
W4A tries to bring together a very diverse set of researchers. It tries to provide a unified view on many diverse areas, and to stimulate discussion among different communities. Although identified as important by many, Web accessibility and general accessibility are not very popular topics. Work in this area is distributed and carried out by small teams. The discussion about, and influence of, accessibility research is still at an early stage.
A particularly important aspect of the conference is that disabled users are directly involved and participate in the conference. For example, two out of three reviewers for the speech-browser challenge competition are blind users.
The most useful elements of the conference for me were the contact with the main researchers in Web accessibility and the contact with disabled users. I definitely learned new things, especially about the motivation for my work. I also had lots of discussions with people from the W3C Web Accessibility Initiative (WAI). Some new ideas I got include using multimedia metadata to improve the accessibility of multimedia data on the Web (for example, reusing K-Space annotations for this). This will also be an issue in the next Multimedia Semantics XG.
KEYNOTE: Enabling an Accessible Web 2.0
Becky Gibson
KEYNOTE: Web 2.0: Hype or Happiness?
Mary Zajicek
Communications Paper: Position Paper: Accessible Image File Formats - The Need and the Way
Sandeep Patil
Communications Paper: The National Accessibility Portal: An Accessible Information Sharing Portal for the South African Disability Sector
Louis Coetzee et al
Communications Paper: A Preliminary Usability Evaluation of Strategies for Seeking Online Information with Elderly People
Sergio Sayago
Keynote: Accessibility of Emerging Rich Web Technologies: Web 2.0 and the Semantic Web
Michael Cooper
Technical Paper: Quantitative Metrics for Measuring Web Accessibility
Markel Vigo et al
Keynote: Semantic Web: The Story So Far
Ian Horrocks
Technical Paper: Experimental Evaluation of Usability and Accessibility of Heading Elements
Takayuki Watanabe
Slides: http://www.w3.org/2007/Talks/0509-www-keynote-tbl/
The keynote has been recorded, we have the video!
My opinion: the first part was interesting and I think that the circles for Web science will be reused a lot. The second part, about the shape of the data and the challenges, consisted of general statements and known things, not really interesting. Overall, a good TBL talk because of the first part (I have seen worse from him :-)
Poll: who has attended 1 WWW conference? 2? 3? 4? etc. 15? 16? (only him!)
Analogy with Physics: from the micro phenomenon to the macro phenomenon. This is the whole story of the Web. Nice diagram showing how the sum of micro effects makes an emergent phenomenon (6*10^9 users, 10*10^9 web pages). The macro phenomenon can then be analyzed, which raises issues. Is this what we want? This is the magic of Web science!
Between the lines: I think TBL tried to distance himself from OWL, rules, and all these inference mechanisms. For him, the key point is URIs + RDF + HTTP. URIs need to be dereferenceable and use the HTTP protocol. The key point is to have linked data ... and the Tabulator! I see this as another shift away from the academic view of the KR-SW world. The key question is then why people should expose their data (the protein example): what is the added value for them? This is the linking incentive!
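To make the "URIs + RDF + HTTP" point concrete, here is a minimal sketch of dereferencing a URI with content negotiation to get RDF back. This is my own illustration, not something from the talk; the resource URI is hypothetical.

```python
# Minimal sketch (not from the talk): dereference a URI over HTTP and
# ask for an RDF representation via content negotiation.
import urllib.request

uri = "http://example.org/people/alice"  # hypothetical resource URI

req = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
with urllib.request.urlopen(req) as resp:
    print(resp.headers.get("Content-Type"))  # ideally application/rdf+xml
    print(resp.read()[:300])                 # start of the RDF description
```

A client like the Tabulator then follows the links it finds in that RDF to pull in more data.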
Shapes of data: lines (tape, cards); matrix (databases); trees (SGML, XML,
OO); Net (internet?, WWW?)
What shape IS the WWW? The shape is, and should be, a fractal tangle! We should engineer for that.
A Web Science attitude is necessary: problems combine technical and social aspects, and problems are a function of the very large scale (Web Science is multidisciplinary!)
Question and Answers:
Steven Pemberton: in the circles, every time you end up with spam ... except for the SW. Can we block it now so that we don't get SW spam? TBL: good question! Studies on the provenance of triples are fundamental. Trust will also be an important issue. The web is passive; you don't get web spam. You can have scrappy web sites, but that is not the same thing. ??: About your view on the shape of the web. TBL: the collective consciousness of the web. There is a common culture here. The working groups are very important for making new technical specifications.
Raphael: I find that the defenders of Web science have very weak arguments to justify that it is not just about having a slogan (is the Web not hype enough?) for getting more money, or, as Peter said: an initiative → an institute → an empire!
Zeljko: Conclusion = NO CONCLUSIONS. Lots of philosophical discussion, little compromise in views.
Main questions: What is Web Science? Is it a new discipline or a new name for an old discipline? Is it a genuine academic discipline at all? What is a Web Science methodology? What is the core knowledge set that Web Science practitioners share? What does a Web Science paper look like?
Some interesting comments:
He is Head of Yahoo! Research and a professor at Stanford.
Content: editorial, free, commercial
Audience: consume, enrich, transact
In the middle: AOL, Google, IAC, MSN, NewsCorp, Yahoo! → they make business by putting that together.
Search on the web: algorithmic results = audience (left side) and advertisements = monetization (right side).
Search and content supply: people don't want to search, they want to get tasks done (e.g., I want to book a vacation in Tuscany).
Information integration: information extraction and schema normalization. Semantic structure is not easy!
How do we circumvent this? Provide incentives.
Statistics about the growth of content on the web. User-generated metadata
(tags=100Mb/day; reviews=around 5Mb/day; ratings=small)
START metadata: Stars, Tags (label for retrieval or sharing), Access, Routing
(community), Text
Example: Flickr. No image analysis, but use of the community phenomenon. Why do millions of users share and tag each other's photographs?
Challenges: How do we use these tags better? How do we cope with spam? What's the rating and reputation system? What are the incentive mechanisms? (the ESP Game! or Yahoo! Answers)
What assignment of incentives leads to good user behavior? What is "good" user behavior? Good questions, good answers, new questions ...? Whom do you trust and why?
Grand challenges: How do we retain and enrich participation? (online media
experiences)
"I'm looking for a science that will retain and enrich participation"!
Flashback: HCI and CHI → the science of online audience engagement, not
just about people interacting with computers or the web, but about people
interacting with other people with the web as a medium.
Audience engagement in Second Life is much bigger. Sweden to set up
embassy in Second Life!
What does it mean to have an engaged audience? Who cares? (advertisers, plus
media and users)
New audience metrics? → funny formula but not totally implausible!
Grand challenge (again!): devise and standardize defensible metrics of online engagement and use these to predictively devise online experiences (not a substitute for creativity).
Microeconomics meets CS? The talk covered how to match ads to the query and its context (IR), and how to order the ads and price a click-through (economics).
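The talk did not spell out a mechanism, but as a rough sketch of "ordering the ads and pricing a click-through", here is a generalized second-price style ranking by bid times estimated click-through rate; this is one common approach, and all bids, CTRs and advertiser names below are invented.

```python
# Illustrative only: a generalized-second-price-style auction.
# Ads are ranked by bid * estimated CTR; each winner pays the minimum
# price per click needed to keep its slot. All numbers are made up.
ads = [
    {"advertiser": "A", "bid": 1.20, "ctr": 0.04},
    {"advertiser": "B", "bid": 0.90, "ctr": 0.07},
    {"advertiser": "C", "bid": 2.00, "ctr": 0.01},
]

ranked = sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True)

for i, ad in enumerate(ranked):
    if i + 1 < len(ranked):
        below = ranked[i + 1]
        # Smallest bid that would still out-rank the ad in the slot below.
        price = below["bid"] * below["ctr"] / ad["ctr"]
    else:
        price = 0.0  # no competitor below; a real system would use a reserve price
    print(f"slot {i + 1}: {ad['advertiser']} pays {price:.2f} per click")
```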
A new convergence? Computing meets humanities like never before
(sociology, economics, anthropology, ...).
Conclusion: Vannevar Bush (As We May Think) ... quoted sentence.
Where are the SW data and documents? GRDDL is markup for declaring that an XML document contains semantic metadata. Microformats (hCal, XFN, hCard). Microformats have limits: they can't be validated, there is no standard way to get the data out of the HTML, and they are too domain-specific. GRDDL makes the microformat data viewable as SW data.
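As a rough sketch of what a GRDDL-aware agent does (my own illustration, not from the talk): fetch the page, fetch the XSLT transformation the page declares, and apply it to obtain RDF. Both URLs below are placeholders, and the sketch assumes the lxml library.

```python
# Sketch of the GRDDL idea, not the W3C reference implementation:
# apply the XSLT transformation declared by an XHTML page to extract
# RDF from its microformat markup. Both URLs are placeholders.
from urllib.request import urlopen
from lxml import etree  # assumes lxml is installed

page_url = "http://example.org/contacts.html"       # page with hCard markup
transform_url = "http://example.org/hcard2rdf.xsl"  # XSLT linked via GRDDL

page = etree.parse(urlopen(page_url), etree.HTMLParser())
to_rdf = etree.XSLT(etree.parse(urlopen(transform_url)))

rdf = to_rdf(page)  # result is an RDF/XML document
print(etree.tostring(rdf, pretty_print=True).decode())
```

In real GRDDL the transformation URL is discovered from the document itself (its profile and link elements); it is hard-coded here for brevity.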
Four documents produced: Spec, Test Cases, Primer and Use Cases.
Showed a lot of use cases that benefit from using GRDDL to extract the RDF data from the microformats.
Main message: don't just take the benefits of the Web; we also have to take responsibility.
Christian Halaschek-Wiener: Toward Expressive Syndication on the Web
Web Ontology Language (OWL) for syndication: the motivating example is defining what a Risky Company is in the financial domain, so that, given news information about the products of a company, a reasoner can predict the risk of this company and anticipate the auctions.
OWL-based syndication framework: OWL reasoning is hard and static (the consistency of the entire KB needs to be rechecked), so how do we make this practical?
Recent work on incremental consistency checking under instance updates.
A simpler problem for query answering: reduce the portion of the KB that must be considered for a query given an update.
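To make the "Risky Company" motivation concrete, here is a minimal sketch of defining such a class in OWL and letting a DL reasoner classify instances as news arrives. This is my own illustration using the owlready2 library, with all class, property and individual names invented; it is not the authors' actual framework.

```python
# Illustrative sketch only: a "risky company" as an OWL defined class,
# classified by a DL reasoner when new instance data (news) arrives.
# Uses owlready2 (which bundles the HermiT reasoner); names are invented.
from owlready2 import Thing, ObjectProperty, get_ontology, sync_reasoner

onto = get_ontology("http://example.org/syndication.owl")

with onto:
    class Company(Thing): pass
    class Product(Thing): pass
    class produces(ObjectProperty):
        domain = [Company]
        range = [Product]
    class RecalledProduct(Product): pass
    # A risky company is any company producing at least one recalled product.
    class RiskyCompany(Company):
        equivalent_to = [Company & produces.some(RecalledProduct)]

# A syndicated news item arrives: ACME's widget has been recalled.
acme = Company("ACME")
widget = RecalledProduct("widget_x")
acme.produces = [widget]

sync_reasoner()                   # re-classifies the whole knowledge base
print(RiskyCompany in acme.is_a)  # ACME is now inferred to be risky
```

The sync_reasoner() call illustrates exactly the problem the talk addresses: it re-checks the whole KB on every update, which is what the incremental consistency-checking work tries to avoid.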
David Huynh: Exhibit: Light-weight Structured Data Publishing
OK, it is a cool presentation from the PiggyBank MIT guy. You just add more attributes to your HTML, and thanks to some cool JSON/AJAX stuff, you display cool things on your web page (calendar, timeline, maps, faceted browser, etc.) → Exhibit: a new microformat? It claims to address the publishing needs and desires of people who want to publish structured Semantic Web data with sophisticated interfaces ... BUT there is no evaluation whatsoever that 1/ there are user needs for such functionality and 2/ Exhibit actually addresses these needs.
Some posters I have found interesting: