I've been a student at Seneca since 2010, and in all these years I never thought about attending FSOSS until now, when I had to go as part of my Open Source course requirement. I could not attend all the talks at FSOSS, but a couple of them caught my attention because I was already familiar with the topics and had some interest in them, so I decided to attend those talks in particular. Below is my account of the talks I attended and what I learned from attending FSOSS.
Topics of Interest
I decided to attend the MongoDB talk at FSOSS 2013 because of my newly developed interest in NoSQL databases. Since the beginning of Fall 2013 I've been working towards my MongoDB certification for Java developers from MongoDB University, so I am familiar with the MongoDB concepts. I wanted to find out what new information I could gain from attending the talk. It was given by Kevin Cearns, who talked about the advantages of NoSQL databases over relational databases and about MongoDB in particular.
Kevin started off by listing the four types of NoSQL databases, which are:
- Key-value stores
- Column based
- Document based (MongoDB)
- Graph based
While there are a number of NoSQL databases on the market, MongoDB is the leading open-source NoSQL database available. Kevin mentioned that MongoDB is currently ranked the number 6 database.
MongoDB is a document database written in C++, designed to scale horizontally and to deliver high performance. MongoDB has a flexible schema, which means that the documents in a collection are not bound to a fixed structure. Documents in MongoDB are stored in a format called BSON, or Binary JSON. A BSON document has the same structure as a JSON document, which means that every document in Mongo is enclosed in curly brackets just like a JSON document. Saying that Mongo has a flexible schema means that when the documents in a collection have, for example, 4 fields, documents with a different number of fields can still be added to the same collection. Collections are not schema-bound the way tables in relational databases are. Being schema-less also means that the same field can hold different types of data in different documents of a collection. For example, a field used for Name can hold characters, decimal values, integers and so on. This does not work in relational databases because once a column is defined with a particular data type, it can only hold values of that data type. This illustrates the flexibility of MongoDB.
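To make the flexible-schema idea a bit more concrete, here is a minimal sketch in plain Python. It does not use the MongoDB driver: a list stands in for a collection and dicts stand in for documents, and the names and values (`Alice`, `OSD600` and so on) are made up for illustration.

```python
import json

# A plain list stands in for a MongoDB collection, and dicts stand in for
# documents. This is only an illustration, not the MongoDB driver API.
collection = []

# A document with four fields (all values are hypothetical).
collection.append({"name": "Alice", "age": 30, "city": "Toronto", "course": "OSD600"})

# A document with only two fields goes into the same collection.
collection.append({"name": "Bob", "age": 25})

# Here the "name" field holds a number instead of a string.
collection.append({"name": 42, "city": "Ottawa"})

# Record how many fields each document has and what type "name" holds.
field_counts = [len(doc) for doc in collection]
name_types = [type(doc["name"]).__name__ for doc in collection]

# Each document serializes to a JSON object wrapped in curly brackets.
for doc in collection:
    print(json.dumps(doc))
```

Nothing stops the second and third documents from differing from the first, which is roughly the behaviour described above. A relational table would reject both.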
MongoDB also has full support for primary and secondary indexing. Indexes support the efficient resolution of queries in MongoDB. Without indexes, MongoDB must scan every document in a collection to select those documents that match the query statement. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection. MongoDB can use indexes to return documents sorted by the index key directly from the index without requiring an additional sort phase.
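The difference between a full scan and an indexed lookup can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the sample documents and the helper names (`find_by_scan`, `build_index`, `find_by_index`) are hypothetical, and a dictionary stands in for the index, whereas a real database uses a more elaborate on-disk structure.

```python
from collections import defaultdict

# Hypothetical sample documents standing in for a collection.
documents = [
    {"_id": 1, "talk": "MongoDB", "room": "A"},
    {"_id": 2, "talk": "OpenCL", "room": "B"},
    {"_id": 3, "talk": "MongoDB", "room": "C"},
]

def find_by_scan(docs, field, value):
    """Check every document, as a query without an index has to."""
    return [d for d in docs if d.get(field) == value]

def build_index(docs, field):
    """Map each value of `field` to the positions of the matching documents."""
    index = defaultdict(list)
    for pos, d in enumerate(docs):
        if field in d:
            index[d[field]].append(pos)
    return index

def find_by_index(docs, index, value):
    """Jump straight to the matching documents through the index."""
    return [docs[pos] for pos in index.get(value, [])]

talk_index = build_index(documents, "talk")
scan_result = find_by_scan(documents, "talk", "MongoDB")
index_result = find_by_index(documents, talk_index, "MongoDB")
```

Both calls return the same documents, but the scan touches all three while the indexed lookup only touches the two that match.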
Sharding is another feature that Mongo supports. It is the process of storing data records across multiple machines and is MongoDB’s approach to meeting the demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data or to provide acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding, you add more machines to support data growth and the demands of read and write operations.
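As a rough illustration of spreading records across machines, the sketch below assigns each document to a shard by hashing a key. It is a simplification under my own assumptions: the shard count, the `user_id` key and the `shard_for` helper are hypothetical, lists stand in for machines, and this is not how MongoDB itself routes data internally.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical number of machines

def shard_for(key, num_shards=NUM_SHARDS):
    """Pick a shard by hashing the shard key (stable across runs)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each inner list stands in for the data held by one machine.
shards = [[] for _ in range(NUM_SHARDS)]

for user_id in range(12):
    doc = {"user_id": user_id}
    shards[shard_for(user_id)].append(doc)

for number, shard in enumerate(shards):
    print(f"shard {number}: {len(shard)} documents")
```

Because the same key always hashes to the same shard, a lookup by `user_id` only needs to ask one machine. Adding machines means choosing a larger `NUM_SHARDS`, although a real system also has to move existing data around when that happens.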
Kevin also spoke about replication. A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments. Copies of the data can live on multiple machines.
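The idea of several processes holding the same data can be sketched as follows. This is a toy model, not MongoDB's replication protocol: the `ReplicaSetSketch` class and its methods are hypothetical, writes are copied to every member immediately, and the "failover" simply promotes the next member in the list.

```python
class ReplicaSetSketch:
    """Toy stand-in for a replica set: every member holds the same data."""

    def __init__(self, num_members=3):
        # Each inner list stands in for one mongod process's copy of the data.
        self.members = [[] for _ in range(num_members)]
        self.primary = 0  # position of the member currently serving requests

    def write(self, doc):
        # Copy the document to every member (a simplification of replication).
        for member in self.members:
            member.append(dict(doc))

    def fail_primary(self):
        # Pretend the primary went down and promote the next member.
        self.primary = (self.primary + 1) % len(self.members)

    def read(self):
        # Reads are served from whichever member is currently primary.
        return self.members[self.primary]


replica_set = ReplicaSetSketch()
replica_set.write({"event": "FSOSS", "year": 2013})
replica_set.fail_primary()
data_after_failover = replica_set.read()
```

Even after the original primary "fails", the data is still available from another member, which is the redundancy and availability benefit mentioned above.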
He then went on to talk about installing MongoDB and how easy it is to set up, and he demoed the installation process for the audience. I already have MongoDB installed and set up on my local computer, so this part was a bit repetitive for me. He went on to create a collection called FSOSS and demoed the basic Mongo commands, then went into a lot of detail about Mongo and demoed a lot of its functionality, which I thought was pretty cool.
The other talk that interested me was on OpenCL. I took the GPU programming course in my program last semester, where I learned about CUDA programming concepts in detail and got an introduction to OpenCL programming that covered its basic concepts. So, when I heard about the OpenCL talk at FSOSS I just had to sit in. This talk was given by Adrien Guillon.
Adrien started off by talking about Big Data and how useful the power of GPU programming is when computing on big data. He introduced the computational model, where the CPU and GPU sit in that model, and how they work together during computations performed by a computer. The talk then led into what GPU programming is and how it works as opposed to traditional CPU programming. He spoke about the benefits of GPU programming, how far it has come, where he thinks it is going and where he wants it to go.
The talk did not cover much about OpenCL itself or what it is. He just spoke about how OpenCL is used to build high-level abstractions that are useful for many purposes. He then went on to talk about the issues with OpenCL. He said that OpenCL changes drastically with every release, that each release is not very compatible with the previous one, and that OpenCL does not have a stable release yet. Adrien spoke about making open source GPU programming stable so that in the future programmers will not have to change their code to support every new release.
Both speakers made it clear that open source is a huge and ever-growing community. Open source is so widespread that it covers every aspect of computer programming and development. Kevin, who spoke about MongoDB, talked about open source from a database perspective and how it has given rise to many alternatives to SQL databases, like the NoSQL database MongoDB. Adrien, who spoke about open source development from a pure programming perspective, talked about how open source has taken the GPU programming world by storm. What linked the two speakers was that both of them talked about Big Data and how these two technologies (MongoDB and OpenCL) can prove very useful in performing huge computations for big data analysis.
My thoughts on the Open Source community
Before taking the Open Source course I did not realise how large the open source community really is. After attending FSOSS, I have come to realise just how large it is. I was always under the impression that the open source community focussed mainly on web development, probably because I was only familiar with the work of Mozilla. Well, my impression has changed a lot now. Open source is a huge community working in all fields of programming and development. When someone decides to focus their career on open source, they really have to decide which field they want to work in.
I am glad that open source is a fast-growing community of very talented developers who make easily available software solutions for people who do not want to be sucked into the very expensive licensing processes of large companies like Microsoft and others that produce heavily licensed applications which are very hard to obtain.
Thoughts on FSOSS
My FSOSS experience was sort of 50/50. There were many very interesting talks, but most of them were held in the same time slot, which prevented many students from attending some of the more interesting ones. Also, most of the talks were very long. Some ran over an hour, and people like me do not have the attention span to last that long. But I must say it was a fairly good experience, and I might try to make it next time as well.