Introduction
A set of standard tools has to be selected from a wide range of possibilities. The main question: which type of database stores and retrieves metadata efficiently and scales to large amounts of data?
Requirements
- Free and open source, running under both Windows and Linux.
- Capable of handling small binary data. The big stuff is stored on the filesystem, but many features require small amounts of binary data to be stored in the database.
- Fast and scalable enough for large amounts of metadata
- .NET interface
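The split between large files on the filesystem and small binary data in the database can be sketched as follows. This is only an illustration in Python, using the standard-library `sqlite3` module as a stand-in for whichever database is finally chosen; the table name `asset`, its columns and the helper functions are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) asset table."""
    conn = sqlite3.connect(path)
    # file_path points at the large payload on the filesystem;
    # thumbnail holds the small binary data kept in the database itself.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS asset ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " file_path TEXT NOT NULL,"
        " thumbnail BLOB)"
    )
    return conn


def add_asset(conn, name, file_path, thumbnail):
    """Store the metadata and the small binary blob; return the new row id."""
    cur = conn.execute(
        "INSERT INTO asset (name, file_path, thumbnail) VALUES (?, ?, ?)",
        (name, file_path, thumbnail),
    )
    conn.commit()
    return cur.lastrowid


def get_thumbnail(conn, asset_id):
    """Return the stored blob as bytes, or None if the asset does not exist."""
    row = conn.execute(
        "SELECT thumbnail FROM asset WHERE id = ?", (asset_id,)
    ).fetchone()
    return None if row is None else row[0]
```

The large file itself is never read here; only its path is recorded, so the database stays small while the filesystem carries the bulk data.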
Databases
SQL databases
- PostgreSQL
- MySQL
NoSQL databases
Various non-RDBMS systems. Graph DBs are excluded here because, in my opinion, they form a class of their own.
Graph Databases
[http://wiki.github.com/tinkerpop/gremlin Gremlin] is a graph traversal language; for Graph DBs it plays a role loosely comparable to that of an ORM. Gargamel also seems of interest.
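To give an idea of the kind of query a graph database is built for, here is a small sketch in Python. It is not Gremlin syntax and uses no graph database at all; the adjacency dictionary, the vertex names and the function `neighbours_within` are made up for illustration.

```python
from collections import deque


def neighbours_within(graph, start, max_hops):
    """Return {vertex: hops} for every vertex reachable from start
    by following at most max_hops edges (breadth-first search)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(vertex, ()):
            if nxt not in hops:
                hops[nxt] = hops[vertex] + 1
                queue.append(nxt)
    del hops[start]  # the start vertex is not its own neighbour
    return hops


# Hypothetical "knows" relationships stored as adjacency lists.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
}
```

In a relational schema the same question ("who is within two hops of alice?") typically needs one self-join per hop, which is one reason graph stores are treated separately here.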
The NoSQL tools offer good scalability and easy synchronisation (replication), which can also be used for continuous backup.
A MongoDB and CakePHP solution is currently used for the proof of concept.
Interesting links
A collection of the most interesting links I found on the topic.
General
Scaling websites (I really liked the 14 rules in the Yahoo presentation)
HBase Vs Cassandra
HBase Vs CouchDB
MongoDB Vs CouchDB
- [http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB]
- [http://www.snailinaturtleneck.com/blog/2009/06/29/couchdb-vs-mongodb-benchmark http://www.snailinaturtleneck.com/blog/2009/06/29/couchdb-vs-mongodb-benchmark]
Graph Database
- [http://blog.directededge.com/2009/02/27/on-building-a-stupidly-fast-graph-database On building a stupidly fast graph database]
Architecture of famous websites
ORM
With a NoSQL document store there is little need for an ORM, because objects are stored as documents rather than mapped onto relational tables.
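A minimal sketch of why the mapping layer largely disappears with a document store: the object is turned into a plain document and back without any table or column mapping. The example is in Python; `DocumentCollection` is an in-memory stand-in written for this page, not the API of MongoDB or any other real driver, and the `Asset` class is hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Asset:
    name: str
    file_path: str
    tags: list = field(default_factory=list)


class DocumentCollection:
    """In-memory stand-in for a document collection (not a real driver API)."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        """Serialise the document to JSON and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = json.dumps(doc)
        return doc_id

    def find_one(self, doc_id):
        """Return the stored document as a dict, or None if it is unknown."""
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)


assets = DocumentCollection()
asset = Asset("scan-001", "/data/scan-001.raw", ["raw", "2010"])
doc_id = assets.insert(asdict(asset))        # object -> document
restored = Asset(**assets.find_one(doc_id))  # document -> object
```

Note that the nested `tags` list is stored inside the document itself; in a relational schema it would normally need a separate table and a join, which is the kind of work an ORM automates.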
[http://en.wikipedia.org/wiki/Object-Relational_mapping http://en.wikipedia.org/wiki/Object-Relational_mapping]
http://en.wikipedia.org/wiki/List_of_object-relational_mapping_software#.NET
The only ones I know well are NHibernate and Castle ActiveRecord, so I would tend to use them, although they may not be the best solution. I removed the non-open-source tools from the list.
Microsoft
- ADO.NET Entity Framework, Microsoft's ORM (released with .NET 3.5 SP1)
- LINQ to SQL, free, .NET Framework component
- SubSonic, free ORM and code generation tool backed by Microsoft
My favorites
- NHibernate, open source
- Castle ActiveRecord, ActiveRecord for .NET, open source
- Fluent NHibernate, open source and free
- DataObjects.Net, open source and commercial licenses
Others
- Atlas, open source
- Business Logic Toolkit for .NET, open source
- Crystal Mapper, open source
- Developer Express, eXpress Persistent Objects (XPO)
- Euss, open source
- Habanero, free open source enterprise application framework with a free code generation tool
- iBATIS, free open source
- Neo, open source
- ObjectMapper .NET, GPL and commercial license
- Picasso, open source ORM framework and code generator (relational and XML), free with commercial support available
- Telerik OpenAccess ORM, free or commercial
- TierDeveloper, free ORM and code generation tool
- Sooda, open source; BSD license
- SubSonic, open source
- Logic Data Access, open source