"While the greater semantic web shows great promise, it is going to be an immensely complex process linking up all content on the entire internet as it continues to grow - effectively tagging everything online. The task is arguably much more viable within the manageable boundaries of a specific business environment.

Information context is immensely important and powerful within the enterprise. We are all familiar with the huge document graveyards multiple SharePoint shared drives and wikis can become - we spend time laboriously finding the location of information and then arriving at the single destination where it lives, over and over again.

While we have some very effective current generation enterprise 2.0 social network style software solutions for collaboration, there is an ongoing huge issue of scaling: an online environment for a team of say 40 can break down when it expands to 400, choking on its own success in a sea of information, for example. This is not the fault of the software but rather of the content being created and uploaded, which lacks the metadata which enables machines to filter and ‘read’ and process it."

1 comment:

Michael Belanger said...

The tagging can be done with good NLP software and small, domain-specific, vocabulary-intensive ontologies. The result is a contextual graph of the information. That graph should then be fragmented into pieces with unique IDs, each of which is then operated as a node in a distributed index. User queries and routing-engine persistent agents use the same approach to produce contextual "need" graph fragments of their own - the index operation then scatter-gathers each "need" through fuzzy (thesaurus-intensive) matching. This approach (Jarg's) scales well, with high performance, while quality and control remain with local domains.
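
To make the idea concrete, here is a minimal Python sketch of one possible reading of that pipeline. It is not Jarg's implementation. It assumes the NLP step has already produced (subject, relation, object) triples, and everything named here (`THESAURUS`, `canon`, `fragment_id`, `IndexNode`, `scatter_gather`, the sample documents) is hypothetical. The "fuzzy" matching is reduced to mapping synonyms onto one canonical term before hashing, and the scatter step runs sequentially rather than across real machines.

```python
import hashlib

# Hypothetical domain thesaurus: each term maps to one canonical concept.
THESAURUS = {
    "invoice": "bill", "bill": "bill",
    "client": "customer", "customer": "customer",
    "contract": "agreement", "agreement": "agreement",
}


def canon(term):
    """Lower-case a term and replace it with its canonical synonym, if any."""
    t = term.lower()
    return THESAURUS.get(t, t)


def fragment_id(subject, relation, obj):
    """Stable ID for one graph fragment (a canonicalized triple)."""
    key = "|".join((canon(subject), relation.lower(), canon(obj)))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


class IndexNode:
    """One local domain's index: fragment ID -> set of document names."""

    def __init__(self, name):
        self.name = name
        self.postings = {}

    def add(self, doc, triples):
        """Index a document under the ID of each of its fragments."""
        for triple in triples:
            self.postings.setdefault(fragment_id(*triple), set()).add(doc)

    def lookup(self, frag_ids):
        """Return {doc: number of requested fragments that doc contains}."""
        hits = {}
        for fid in frag_ids:
            for doc in self.postings.get(fid, ()):
                hits[doc] = hits.get(doc, 0) + 1
        return hits


def scatter_gather(nodes, need_triples):
    """Send a "need" to every node, then merge and rank the answers."""
    frag_ids = {fragment_id(*t) for t in need_triples}
    results = []
    for node in nodes:  # scatter (sequential in this sketch)
        for doc, score in node.lookup(frag_ids).items():
            results.append((score, node.name, doc))
    # gather: highest score first, ties broken by node and document name
    return sorted(results, key=lambda r: (-r[0], r[1], r[2]))


finance = IndexNode("finance")
finance.add("q3-report", [("client", "disputes", "invoice")])
legal = IndexNode("legal")
legal.add("memo-17", [("customer", "disputes", "bill"),
                      ("customer", "signs", "contract")])

matches = scatter_gather([finance, legal], [("Customer", "disputes", "Bill")])
print(matches)
```

Because both the documents and the query pass through the same thesaurus before hashing, "client disputes invoice" and "customer disputes bill" land on the same fragment ID, and each domain keeps control of its own index while still answering the shared query.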