Archive for the ‘Uncategorized’ Category

Readings

April 4, 2009

1. MetaCarta GeoParsing API

2. Open Calais: Calais is a rapidly growing toolkit of capabilities that allow you to readily incorporate state-of-the-art semantic functionality within your blog, content management system, website or application.

3. Tweetburner

4. fuseURL

5. http://dittes.info/blog/2008/06/04/presseclipping-20-twitter-monitor/

6. http://www.wefeelfine.org/ : An exploration of human emotion

7. http://www.monitter.com/ : Monitoring Twitter [you can enter a word or definition to get the exact phrase]

8. http://www.tweetdeck.com/beta/ : aims to make all your Twitter experience as accessible as possible.

You can put your followers into groups and save keyword searches. The drawback to TweetDeck is that the timeframe is only 48 hours, so anything after that cannot be accessed. I still think TweetDeck is a breakthrough, and it is still in beta, so expect more to come.

9. http://hounder.org/index.html : Hounder is a simple and complete search system. Out of the box, Hounder crawls the web targeting only those documents of interest, and presents them through a simple search web page. Hounder is:

  • configurable: Hounder can be customized for your particular needs. Use it as a standalone solution or as a building block for your existing application.
  • scalable: Hounder grows with your needs. Start with a single machine, add more as needed.
  • robust: The big bad web is no place for puppies. Hounder’s searcher has been designed from the beginning to survive traffic surges.
  • easy to integrate: Hounder can be used from various languages, such as Java, PHP, Python, etc.
  • trainable: point Hounder’s attention to the information you want by feeding it with training sets, then sit back and watch it fetch the right documents for you.

10. http://code.google.com/p/wiz4j/ : Wiz4j offers a simple and rapid method to create wizards in Java.

It’s quick and easy, with no need to create XML files: just use Java code to define the wizard structure.

With wiz4j, your wizard will run both in GUI and command line mode.

11. http://tweetstats.com/trends

12. http://www.trackur.com/ : Trackur is an online reputation monitoring tool designed to assist you in tracking what is said about you on the internet. Trackur scans hundreds of millions of web pages–including news, blogs, video, images, and forums–and lets you know if it discovers anything that matches the keywords that interest you.

13. http://www.twitscoop.com/about : Through an automated algorithm, twitscoop crawls hundreds of tweets every minute and extracts the words which are mentioned more often than usual. The result is displayed in a Tag Cloud, using the following rule: the hotter, the bigger.

14 .
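
The rule twitscoop describes in item 13 (words mentioned more often than usual, drawn bigger when hotter) could be sketched roughly as below. This is only my guess at the idea, not twitscoop's actual algorithm: the function names `hot_words` and `font_sizes`, the add-one smoothing, the `min_count` cutoff and the font-size range are all assumptions made for illustration.

```python
from collections import Counter


def hot_words(recent_tweets, baseline_tweets, min_count=2):
    """Score each word by how much more often it appears recently than in a baseline."""
    recent = Counter(w for t in recent_tweets for w in t.lower().split())
    baseline = Counter(w for t in baseline_tweets for w in t.lower().split())
    recent_total = sum(recent.values()) or 1
    baseline_total = sum(baseline.values()) or 1
    scores = {}
    for word, count in recent.items():
        if count < min_count:
            continue  # ignore words seen too rarely to be a trend (assumed cutoff)
        recent_rate = count / recent_total
        # Add-one smoothing (an assumption) so unseen baseline words do not divide by zero.
        baseline_rate = (baseline[word] + 1) / (baseline_total + 1)
        scores[word] = recent_rate / baseline_rate
    return scores


def font_sizes(scores, smallest=10, largest=40):
    """Map scores linearly onto font sizes: the hotter, the bigger."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    span = (high - low) or 1.0
    return {
        word: round(smallest + (score - low) / span * (largest - smallest))
        for word, score in scores.items()
    }


if __name__ == "__main__":
    recent = ["grid grid cloud", "grid cloud"]
    baseline = ["cloud cloud cloud grid"]
    print(font_sizes(hot_words(recent, baseline)))
```

A real service would also need to strip punctuation and stop words and use a much longer baseline window; the sketch leaves those out to keep the ratio idea visible.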

Articles

December 31, 2008
 
 

Grid Explained – The basics of Grid

 
 
 
To drive sustainable business growth, organizations are constantly driving their information technology systems to produce business results sooner and make them available to a wider audience, with improved accuracy and usefulness at the point of need. At the same time, organizations are looking to embrace new computing technologies such as provisioning, orchestration and virtualization and the emerging new standards for interoperability to improve their systems’ resilience and make more efficient use of human, technology, and capital resources.
 
 

CIO Magazine about Grid Computing

 
 
 
Special Report
Grid computing has been on the horizon for a long time. Some prognosticators say it’s the next big thing. But it’s SO big, it’s daunting for many. This collection looks at the basics of grid as it moves ever (so slowly) closer.
 
 

Getting Started With Grid

 
 
 
An increasing number of organizations recognize the potential benefits of grid and related virtualization technologies, and they are rapidly beginning to exploit those benefits.
 
 

The Banker: When the whole is greater than the sum of the parts

 
 
 
Published: 04 November, 2004
Grid technology underlies many of the utility models of computing mooted to revolutionise industry. It is being put to use in financial services organisations and in some cases already providing compute power as a utility. Dan Barnes reports.

In the race to come up with the latest collateralised debt obligation (CDO) structure or another kind of derivative, the secret weapon of advantage could be grid computing. Million dollar profits can be earned by banks that reap first mover advantage in derivatives but often they are held back by a lack of raw computing power. Grid computing harnesses unused capacity in a bank’s PCs and servers to get the calculations done rather than relying on existing mainframes or supercomputers. It consolidates and allows a measured co-ordination of computing resource.

 
 

McKinsey: Managing next-generation IT infrastructure

 
 
 
In recent years, companies have worked hard to reduce the cost of the IT infrastructure—the data centers, networks, databases, and software tools that support businesses. These efforts to consolidate, standardize, and streamline assets, technologies, and processes have delivered major savings. Yet even the most effective cost-cutting program eventually hits a wall: the complexity of the infrastructure itself.
 
 

Grid Computing in the Enterprise

 
 
 
February 9, 2004
Grid computing is an overnight success that has been almost four decades in the making.
Last month’s announcement of the WS-Resource framework, enabling grid resource management with standard Web services protocols, completes a convergence that began with the 1965 introduction of the first multiprocessor computer. Libraries full of bleeding-edge research have since paved grids’ way, developing parallel processing schemes to solve exotic and high-value problems.
 
 

Automation know-how

 
 
 
06 January 2005
Emerging automation tools are making the new data centre more self-reliant than ever.
By Denise Dubie, Network World

The new data centre, with its rapid rate of change and growing complexity, demands software that integrates seamlessly to intelligently automate a range of IT management tasks.

True end-to-end automation in the new data centre would also eliminate the chance that human errors could cause outages or performance problems. Without such over-arching automation, collecting data from multiple sources, making sense of it, putting it into a common format and then knowing what action to take based on business policies will challenge many IT shops in the coming years.

“It will take time, money and know-how about the capabilities available from vendors and those that can be used in-house, but the technology will eventually be available and a culture shift will happen. Automation will mean we can make more services available, at a lower cost, with more accuracy – and that really matters,” says Janice Newell, CIO of Group Health Cooperative in Seattle, USA.

Understanding automation resource by resource is the first step in that process.

 
 

Autonomic attack plans

 
 
 
08 January 2005
Three vendors are battling head-to-head for mindshare of self-managed systems.

Whenever a new technology or methodology seems poised to shake up enterprise IT, vendors hustle to spin the phenomenon in their direction.

One such concept has been the self-managing data centre, often referred to as “autonomic computing,” the term IBM favours. Here, the goal is to design and implement systems that monitor themselves, repair themselves as necessary, protect themselves from external threats and even re-route their own resources to best meet business needs. (That last factor echoes the promise of another red-hot phenomenon, on-demand – or utility – computing.)

 
 

The new data centre: where are we now?

 
 
 
What is the new data centre?
The new data centre has moved from the conceptual idea it was a year ago to a production infrastructure that today’s early adopters are testing and deploying. As the new data centre evolves, all agree that a long-range plan will be based on two ideas. First, the new data centre relies on a new business model, the extended enterprise, which is in itself the basic building block for yet another emerging business model – the global ecosystem. Second, the basis for the extended enterprise’s (and, eventually, the global ecosystem’s) IT infrastructure will be change management.
 
 

The future of the data centre

 
 
 
A revolution is underway in data centre technology. And it is being driven by customers desperate to cut costs and improve responsiveness.

Standards

December 31, 2008

Extensible Messaging and Presence Protocol (XMPP)

 
 
     
 

The Reservoir project – Clouds Interoperability

 
 
 
Resources and Services Virtualization without Barriers is a European Union FP7 funded project that will enable massive scale deployment and management of complex IT services across different administrative domains, IT platforms and geographies. The project will provide a foundation for a service-based online economy, where – using virtualization technologies – resources and services are transparently provisioned and managed on an on-demand basis at competitive costs with high quality of service.
 
 

3Tera “Open Cloud”

 
 
     
 

Open Virtualization Format (OVF)

 
 
     
 

BitTorrent (protocol)

 
 
     
 

SOAP

 
 
     
 

Representational state transfer (REST)

 
 
     
 

Cloud Nine: Specification for a Cloud Computer. A Call to Action.

Open Source

December 31, 2008

Puppet

 
 
 
Put simply, Puppet is a system for automating system administration tasks. To learn more, read our big picture overview of Puppet, or take a deeper look at what Puppet can do with the Puppet Introduction. There’s also an about Puppet page which gives the highlights of Puppet’s functionality.
 
 

Pig

 
 
 
We are creating infrastructure to support ad-hoc analysis of very large data sets. Parallel processing is the name of the game. Our system runs on a cluster computing architecture, on top of which sit several layers of abstraction that ultimately bring the power of parallel computing into the hands of ordinary users. The layers in between automatically translate user queries into efficient parallel evaluation plans, and orchestrate their execution on the raw cluster hardware.
 
 

Nimbus

 
 
 
The University of Chicago Science Cloud, codenamed “Nimbus”, provides compute capability in the form of Xen virtual machines (VMs) that are deployed on physical nodes of the University of Chicago TeraPort cluster (currently 16 nodes) using the workspace service.

Nimbus is available to all members of the scientific community wanting to run in the cloud. To obtain access you will need to provide a justification (a few sentences explaining your science project) and a valid grid credential (if you don’t have a credential, email us; we can help). Based on the project, you will be given an allocation on the cloud.

 
 

Hadoop

 
 
     
 

Eucalyptus

 
 
 
EUCALYPTUS – Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems – is an open-source software infrastructure for implementing “cloud computing” on clusters. The current interface to EUCALYPTUS is compatible with Amazon’s EC2 interface, but the infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly-available Linux tools and basic Web-service technologies making it easy to install and maintain.
 
 

OpenNebula

 
 
     
 

Enomalism

 
 
 
Enomalism is an open source web-based virtual infrastructure platform designed to answer the complexity of managing globally dispersed virtual server environments. Enomalism helps to automate the transition to a cloud computing environment by reducing an IT organization’s overall workload. The easy-to-use dashboard can help with issues including deployment planning, load balancing, automatic VM migration, configuration management, capacity diagnosis and resource monitoring/metering.

Gridgain

 
 
 
GridGain is focused on doing one thing – providing the computational grid platform for Java.
 
 

Globus Toolkit

 
 
 
The open source Globus Toolkit is a fundamental enabling technology for the “Grid,” letting people share computing power, databases, and other tools securely online across corporate, institutional, and geographic boundaries without sacrificing local autonomy. The toolkit includes software services and libraries for resource monitoring, discovery, and management, plus security and file management.
 
 

Mosix

 
 
 
MOSIX is a management system that allows a Linux cluster or a Grid of clusters to perform like a single computer with multiple processors. It is particularly suitable to run intensive computing and applications with moderate amounts of I/O.
 
 

Jini

 
 
 
Jini.org is a central place and resource for the Jini Community℠. It is a site to discover new information, discuss, collaborate, exchange source code and ideas, and advance Jini™ network technology.
Jini network technology is an open software architecture that enables the creation of network-centric solutions which are highly adaptive to change.
 
 

SUN Grid Engine

 
 
 
The Grid Engine project is an open source community effort to facilitate the adoption of distributed computing solutions. Sponsored by Sun Microsystems and hosted by CollabNet, the Grid Engine project provides enabling distributed resource management software for wide ranging requirements from compute farms to grid computing.
 
 

Unicore

 
 
 
UNICORE (Uniform Interface to Computing Resources) offers a ready-to-run Grid system including client and server software. UNICORE makes distributed computing and data resources available in a seamless and secure way in intranets and the internet.
 
 

Open MPI

 
 
 
A High Performance Message Passing Library
Open MPI is a project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build the best MPI library available. A completely new MPI-2 compliant implementation, Open MPI offers advantages for system and software vendors, application developers and computer science researchers.
 
 

OSCAR

 
 
 
OSCAR (Open Source Cluster Application Resources) is a snapshot of the best known methods for building, programming, and using HPC clusters. It consists of a fully integrated and easy to install software bundle designed for high performance cluster computing. Everything needed to install, build, maintain, and use a Linux cluster is included in the suite, making it unnecessary to download or even install any individual software packages on your cluster.
 
 

Xen

 
 
 
Modern computers are sufficiently powerful to use virtualization to present the illusion of many smaller virtual machines (VMs), each running a separate operating system instance. Successful partitioning of a machine to support the concurrent execution of multiple operating systems poses several challenges. Firstly, virtual machines must be isolated from one another: it is not acceptable for the execution of one to adversely affect the performance of another. This is particularly true when virtual machines are owned by mutually untrusting users. Secondly, it is necessary to support a variety of different operating systems to accommodate the heterogeneity of popular applications. Thirdly, the performance overhead introduced by virtualization should be small.
 
 

OGSA-DAI

 
 
 
The aim of the OGSA-DAI project is to develop middleware to assist with access and integration of data from separate sources via the grid. The project was conceived by the UK Database Task Force and is working closely with the Global Grid Forum DAIS-WG, the OMII and the Globus team.
 
 

OpenVZ

 
 
 
OpenVZ is an Operating System-level server virtualization solution, built on Linux. OpenVZ creates isolated, secure virtual private servers (VPSs) or virtual environments on a single physical server enabling better server utilization and ensuring that applications do not conflict. Each VPS performs and executes exactly like a stand-alone server; VPSs can be rebooted independently and have root access, users, IP addresses, memory, processes, files, applications, system libraries and configuration files.
 
 

openQRM

 
 
 
openQRM is designed to deal with all sorts of failures automatically, thus preventing interruptions caused by unexpected events.
Implementing openQRM greatly improves the reliability of the x86 data-center.
openQRM is an open source systems management platform which integrates with existing components in enterprise data centers to create scalable, highly available and customizable infrastructures.
 
 

Gridsphere

 
 
 
The GridSphere portal framework provides an open-source portlet based Web portal. GridSphere enables developers to quickly develop and package third-party portlet web applications that can be run and administered within the GridSphere portlet container. Here you will find the GridSphere portal framework available for download and documentation related to the installation and development of portlets using GridSphere.
 
 

GAT – Grid Application Toolkit

 
 
 
The objective of this workpackage is to design and build a Grid Application Toolkit (GAT) and to plug into this GAT the services developed in other GridLab workpackages.
GAT is a set of coordinated, generic and flexible APIs for accessing Grid services from e.g. generic application codes, portals, data management systems, together with working implementations provided by the tools developed in the GridLab project. GAT is designed in a modular plug-and-play manner, such that tools developed anywhere can be plugged into GAT.
 
 

Mandriva

 
 
 
Mandriva is a worldwide Linux and Open Source leader providing easy-to-use solutions to individuals and organizations.
 
 

Alchemi

 
 
 
Alchemi is an open source software framework that allows you to painlessly aggregate the computing power of networked machines into a virtual supercomputer (desktop grid) and to develop applications to run on the grid.

It has been designed with the primary goal of being easy to use without sacrificing power and flexibility.

Alchemi includes:

  • The runtime machinery (Windows executables) to construct computational grids.
  • A .NET API and tools to develop .NET grid applications and grid-enable legacy applications.

 
 

NGRID

 
 
 
NGrid is an open source (LGPL) grid computing framework written in C#. NGrid aims to be platform independent via the Mono project. NGrid aims to provide:

  • a transparent multithread programming model for grid programming.
  • a physical grid framework and some grid implementations.
  • common utilities for both grid programming and grid implementations.

 
 

Crossbow – Network Virtualization and Resource Control

 
 
 
Crossbow provides the building blocks for network virtualization and resource control by virtualizing the stack and NIC around any service (HTTP, HTTPS, FTP, NFS, etc.), protocol or virtual machine.

Each virtual stack can be assigned its own priority and bandwidth on a shared NIC without causing any performance degradation. The architecture dynamically manages priority and bandwidth resources, and can provide better defense against denial-of-service attacks directed at a particular service or virtual machine by isolating the impact just to that entity. The virtual stacks are separated by means of a hardware classification engine such that traffic for one stack does not impact other virtual stacks.

Project Crossbow is the next step in the evolution of the Solaris networking stack and brings bandwidth resource control and virtualization as part of the architecture itself instead of the usual add-on layers which have heavy overheads and complexity.

 
 

ProActive

 
 
 
ProActive is a GRID middleware (a Java library with Open Source code under LGPL license) for parallel, distributed, and concurrent computing, also featuring mobility and security in a uniform framework. With a reduced set of simple primitives, ProActive provides a comprehensive API to simplify the programming of Grid Computing applications: distributed on Local Area Network (LAN), on clusters of workstations, or on Internet Grids. Portability, Openness, Agility: Write Once, Deploy Everywhere!
 
 

Solr

 
 
 
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.
 
 

Distributed Search for Solr

 
 
     
 

Hadoop

 
 
 
Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data.
 
 

dCache

 
 
     
 

Linux Virtual Server

 
 
     
 

VirtualBox

 
 
 
innotek VirtualBox is a family of powerful x86 virtualization products for enterprise as well as home use. Not only is VirtualBox an extremely feature rich, high performance product for enterprise customers, it is also the only professional solution that is freely available as Open Source Software under the terms of the GNU General Public License (GPL). See “About VirtualBox” for an introduction; see “innotek” for more about our company.
 
 

Talend Open Studio – open source data integration solution

 
 
 
Talend Open Studio, the industry’s first pure open source data integration solution, combines metadata-driven design and execution, with an easy-to-use graphical development environment, to deliver better scalability at a lower total cost of ownership than traditional data integration or Extract, Transform and Load (ETL) solutions. Talend’s technology and business vision shatters the traditional proprietary model and provides the flexibility required to meet the needs of all organizations – regardless of their size, level of expertise or budgetary constraints. To download, please visit

Desktop Virtualization

December 31, 2008

g.ho.st

 
 
     
 

Nomachine

Cloud Services Providers

December 31, 2008

Cloud Products

December 31, 2008

UniCluster Express in Amazon’s Elastic Computing Cloud (EC2)

 
 
 
This paper documents the process of implementing the open source UniCluster Express software in Amazon’s Elastic Computing Cloud (EC2) web service. Published by one of the leading informatics consultancy and HPC systems integrators, the paper presents step by step instructions for setting up a UniCluster Express cluster service within EC2.

http://www.johnmwillis.com/groundwork/cloud-vendors-a-to-z/:

 

Cloud Vendor | Level | Type | Status | Based Off | Beta Status | Notes
3Tera | 3 | Server | Not a Provider (1) | Software Based | Production | 3Tera does provide hosting however their goal is to be a software solution not a hosting solution
Adobe Air | 1 | Application | Not a Provider | Backbone | TBD | Desktop play
Akamai | 0 | Server | Not a Provider | Software Based | Production | CDN
Amazon EC2 | 2 | Server | Provider | Backbone | Beta |
Amazon S3 | 2 | Storage | Provider | Backbone | Beta |
Amazon SimpleDB | 2 | Database | Provider | Backbone | Beta |
Apache CouchDB | 2 | Database | Not a Provider | Software Based | Production | IBM is involved
Apache Hadoop | 2 | Database | Not a Provider | Software Based | Production |
Areti Internet | 0 | Application | Provider | 3Tera | Production |
Box-Net | 1 | Storage | Provider | Backbone | Production |
Cassatt Corporation | 0 | Server | Not a Provider | Software Based | Production | Provisioning play
Citrix (XenSource) | 0 | Utility | Not a Provider | Software Based | Production |
CohesiveFT | 1 | Utility | Not a Provider | Amazon EC2 | Beta | Supports XEN and VMWare
Dell DCS | 2 | Server | Provider | Backbone | TBD |
Elastra | 1 | Server | Provider | Amazon EC2 | Beta |
EMC Mozy | 1 | Storage | Provider | Backbone | Production | Cloud Services Play
Enki | 1 | Server | Not a Provider | 3Tera | Production | Heavier as a services player
Enomaly | 1 | Server | Not a Provider | Amazon EC2 | Beta | Heavier as a services player
Enomaly ElasticDrive | 1 | Storage | Not a Provider | Amazon EC2 | Beta |
EnterpriseDB | 1 | Database | Not a Provider | Amazon EC2 | Beta | Have a cloud offering
Flexiscale | 2 | Server | Provider | Backbone | Production | UK Based
Fortress ITX | 1 | Server | Not a Provider | 3Tera | Production |
Google Apps | 1 | Application | Provider | Backbone | Beta | Desktop play
HP AiaaS | 2 | Server | Provider | Backbone | TBD |
IBM Blue Cloud | 0 | Server | Provider | Backbone | TBD | Provisioning play
iCloud | 1 | Application | Provider | Backbone | Production | Desktop Cloud
Joyent | 2 | Server | Provider | Backbone | Production | Solaris based cloud
JungleDisk | 1 | Storage | Not a Provider | Amazon EC2 | Beta | Low cost utility for S3
Layered Technology | 1 | Server | Provider | 3Tera | Production | A 3Tera mega partner
LongJump | 1 | Database | Not a Provider | Amazon EC2 | Beta |
Microsoft SSDS | 1 | Database | Provider | Backbone | TBD | Competes w/Amazon SimpleDB
MorphExchange | 1 | Utility | Not a Provider | Amazon EC2 | Beta | Ruby on Rails cloud
Mosso | 2 | Server | Provider | Rackspace | Production | Owned by Rackspace
Rackspace | 0 | Server | Provider | Amazon EC2 | Production |
Rightscale | 1 | Server | Provider | Amazon EC2 | Beta |
Salesforce.com | 0 | Application | Provider | SaaS | Production |
Sun Caroline | 2 | Server | Provider | Backbone | TBD |
Sun MySQL | 1 | Database | Provider | Backbone | TBD | Not sure of plans
Terremark | 0 | Server | Provider | Backbone | Production |
VMWare | 0 | Utility | Not a Provider | Software Based | Production |

Level | Description
0 | Cloud Look-Alike
1 | Cloud Guests
2 | Cloud Hosts
3 | Cloud Disruptor

 

Qlayer

 
 
 
Q-layer provides software for data centers that enables true cloud computing. Cloud computing is rapidly changing the computing landscape – by turning data center infrastructure into an agile services delivery platform. New services such as virtual servers, storage or applications are available on-demand to both technical and non-technical users, with usage-based chargeback. By leveraging and augmenting existing data center infrastructures including all popular Hypervisors, Q-layer provides enterprises and service providers with the ability to deliver IT services simply and with dramatically lower TCO.

 

 

Bungee Labs – Platform-as-a-Service

 

 
 

GigaSpaces XAP for Amazon EC2

Eucalyptus (Open Source)

 
 
 
EUCALYPTUS – Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems – is an open-source software infrastructure for implementing “cloud computing” on clusters. The current interface to EUCALYPTUS is compatible with Amazon’s EC2 interface, but the infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly-available Linux tools and basic Web-service technologies making it easy to install and maintain.

 

 

Desktone

 
 
 
The Desktone Virtual-D Platform is the only solution that enables desktops to be delivered as an outsourced, subscription service. It lets enterprises quickly realize the full potential and benefits associated with virtual desktop infrastructure (VDI) environments – dramatically reduced deployment complexity; improved management, security and compliance; and customizable end-user experiences – without the capital expense and complicated systems integration of building and deploying a customized internal solution.

 

Cycle Computing

 
 
 
Cycle Computing: Easy Grid Computing with Condor

Kaavo

Hadoop (Open Source)

 

Cohesiveft

 
 
 
Cohesive Flexible Technologies enables customers to build and manage custom applications for virtualized infrastructure. As a result of major trends like SOA, Virtualization and Cloud Computing, the approach to data centers is undergoing rapid change, with major ramifications on cost and business agility. CFT provides on-demand automation solutions to enable this transition in a de-risked, stable way.
Our solution, Elastic Server On-Demand, is a cost-effective automation solution that enables customers to gain business advantage from new opportunities in data center computing, reducing application infrastructure complexity while increasing agility and customer control. Our Elastic Server On-Demand simplifies the process of creating application stacks for use in virtual environments.

10gen

 

Enomalism (Open Source)

 
 
 
Enomalism is an open source web-based virtual infrastructure platform designed to answer the complexity of managing globally dispersed virtual server environments. Enomalism helps to automate the transition to a cloud computing environment by reducing an IT organization’s overall workload. The easy-to-use dashboard can help with issues including deployment planning, load balancing, automatic VM migration, configuration management, capacity diagnosis and resource monitoring/metering.

3leafsystems

 

RightScale

 
 
 
RightScale’s automated cloud computing management system helps you create scalable web applications that run on Amazon’s Elastic Compute Cloud. Our advanced auto-scaling and load balancing features ensure your site’s uptime and reliability. The RightScale Dashboard makes it easy to set up, launch, and monitor all of your EC2 and AWS activities.

Hello Cloud!

December 29, 2008

Just a sample line before writing anything….