1. What are the major differences in function and history between memcached and redis?
Memcached provides general-purpose memory caching: it stores small chunks of data in the form of strings and objects [5]. Redis can fill the same caching role as memcached and additionally offers data structures such as lists, sets, and sorted sets, which memcached does not provide [6]. Even though Redis has more features than memcached, memcached has been used successfully in industry for years and remains a standard product for memory caching.

2. Why does Facebook still use memcached and MySQL when more modern databases are available?
Based on the presentation from Aditya Agarwal of Facebook, Facebook uses memcached and MySQL for several reasons.
Memcached - is fast and does in-memory hashing very well. Facebook primarily uses memcached as a query results cache, and has optimized it (UDP support, multithreading, new network drivers, etc.) to run at the highest possible level.
MySQL - is, in Facebook's opinion, a very reliable way of storing data. Facebook uses MySQL as a key/value database with no joins, which is easy to scale on the web tier.
Memcached and MySQL provided the best solution for Facebook's data models as of 2010.

3. How and why is replication used for key/value dbs?
Replication is implemented in key/value dbs with a master-slave configuration: a write goes to the master once, and the slaves are then updated with the information [1,2]. The master-slave configuration does not restrict the number of slaves linked to the master. This is very beneficial if the bulk of requests are reads, since reads can be serviced by many slaves. Based on [2], Redis uses asynchronous replication, which does not block the master, so the master can keep serving requests while slaves are being updated.
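To make the non-blocking behavior concrete, here is a toy sketch in Python. It is not how Redis implements replication; the Master and Replica classes and the queue-based hand-off are assumptions made up purely for illustration. The master applies a write locally, drops it on each replica's queue, and returns without waiting for the replicas to catch up.

```python
import queue
import threading


class Replica:
    """Hypothetical read-only copy that applies writes in the background."""

    def __init__(self):
        self.data = {}
        self.pending = queue.Queue()
        worker = threading.Thread(target=self._apply_writes, daemon=True)
        worker.start()

    def _apply_writes(self):
        # Drain the queue forever, applying each replicated write in order.
        while True:
            key, value = self.pending.get()
            self.data[key] = value
            self.pending.task_done()

    def get(self, key):
        return self.data.get(key)


class Master:
    """Hypothetical master: writes locally, then queues the write for replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def set(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            # put() returns immediately; the master never waits on a replica.
            replica.pending.put((key, value))


replicas = [Replica() for _ in range(3)]
master = Master(replicas)
master.set("user:1:name", "alice")

# Wait for the background workers only so this demonstration is deterministic.
for replica in replicas:
    replica.pending.join()

# Reads can now be served by any replica.
values = [replica.get("user:1:name") for replica in replicas]
```

Because the write is only queued, a read from a replica right after master.set() could briefly return stale data; that is the usual trade-off of asynchronous replication.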
Karl Seguin [1] briefly illustrates that replication can protect one's data by copying it to many servers. In addition, replication can increase performance for read queries, which can be serviced by any slave.

4. When and why is denormalization good and bad?
Denormalization is used extensively in data warehouses. Using normalized data in a data warehouse introduces problems with data retrieval and performance, and the denormalized star schema was created to alleviate these problems [3]. The star schema generally keeps transitive dependence to a minimum rather than eliminating it, so its tables are not fully normalized. A denormalized data warehouse provides a better foundation for data retrieval, in my opinion. In many cases, the same query is quicker to write and shorter against a data warehouse than against a normalized database.
A denormalized database is not always the answer, even when working with large databases. A normalized database minimizes duplicated data and protects the database from logical inconsistencies [3]. Normalized databases typically also rely on transactions to isolate users or applications that access the same record at the same time. Transactions help keep data consistent and accurate by locking a record and waiting for the current operation to finish before unlocking it so that another user can access it.

5. What if you want to query the same value by different keys in redis?
SINTER returns the members of the set resulting from the intersection of all the given sets.

6. What does the PHP md5 command do for memcached?
To my knowledge, the md5 command allows memcached to easily retrieve the key from the hash ring. The md5 command converts a string into a 128-bit hash (returned as a 32-character hexadecimal string), and this value is placed in an organized fashion on the hash ring. This allows memcached to quickly retrieve the key and display the information.

7. What is a common partitioning technique for key/value dbs? How does this work?
A common partitioning technique for key/value dbs is consistent hashing. The basic idea of consistent hashing is that when the hash table is adjusted, for example when a server is added or removed, only a small share of the key/value pairs has to move. The hash algorithm converts a string into a 32- or 64-bit value that is placed on a ring so it can be easily retrieved [4].

8. Create your own example of an application that uses redis that would require a decision of how to do data modeling. Explain the 2 choices and why one is better. Do not use the same example from the lecture or any reading.
Simple Blog - Data Model - This application stores the main blog post with its publisher, title, body, comments, and tags. It also stores the publisher/user ID, password, and blog ID (which links the publisher/user to the blog post).
blog:id:publisher       Who wrote the blog post
blog:id:blogtitle       Title of the blog
blog:id:blogbody        Blog contents
blog:id:blogcomments    Any comments associated with the blog
blog:id:blogtags        Tags to organize the blog post into categories
user:id:name            Publisher/User's Name
user:id:password        Publisher/User's Password
user:id:blogid          Links the publisher/user to the blog post
Basic Lightweight Directory Access Protocol (LDAP) using Redis - I've used different implementations of LDAP with different databases and felt that having an enterprise database was a little overkill. After working with Redis, I think a key/value db would work very well. Below is a basic implementation of this.
user:id:name            User's Name
user:id:password        User's Password
user:id:title           User's Work Title
user:id:department      User's Department
user:id:address         User's Work Address
user:id:pnumber         User's Work Phone Number
user:id:admingroup      Group assigned to for access level
user:id:manager         Manager Name
group:id:admingroup     Name of Admin Group
manager:id:name         Manager Name
manager:id:userid       Links Users to Managers
Of the two examples above, I would go with the second.
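As a rough illustration of that second choice, the "object:id:field" key layout can be tried out without a server. The sketch below uses a plain Python dict as a stand-in for the Redis keyspace; the helper names (key, save_user, get_user_field) and the sample user are made up for the example and are not part of any Redis client library.

```python
# A plain dict stands in for the Redis keyspace. Against a real server,
# the writes and reads below would be SET and GET commands.
store = {}


def key(kind, ident, field):
    """Build a key in the 'kind:id:field' style, e.g. 'user:42:name'."""
    return f"{kind}:{ident}:{field}"


def save_user(user_id, **fields):
    """Store each directory attribute under its own key (hypothetical helper)."""
    for field, value in fields.items():
        store[key("user", user_id, field)] = value


def get_user_field(user_id, field):
    """Look up one attribute; returns None if it was never stored."""
    return store.get(key("user", user_id, field))


# Made-up sample entry, for illustration only.
save_user(42, name="Pat Example", title="Analyst",
          department="Finance", admingroup="staff")

department = get_user_field(42, "department")
missing = get_user_field(42, "pnumber")  # never stored
```

One key per attribute keeps every lookup a single GET, at the cost of several writes per user; a Redis hash per user would be the other obvious layout to weigh.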
I feel that a Redis-backed directory would have a greater impact on LDAP implementations and would increase the usability and efficiency of these products.

9. What are the important differences between Voldemort and Redis?
Based on a comparison of Voldemort and Redis, it seems that Voldemort is configured to work automatically within a distributed environment, whereas Redis is only configured to work on a single thread. Voldemort has built-in fault tolerance, while Redis does not. From what I can find, another major difference is that Redis has an API for many programming languages, whereas Voldemort is Java only.

10. How exactly does Retwis implement followers in the code (using what redis feature)?
Retwis uses Redis sets to follow other users.
SADD to uid:(user id):followers - holds the set of followers
SADD to uid:(user id):following - holds the set of who I'm following
() = UserID value

------------------
[1] "The-little-redis-book." GitHub. Accessed September 25, 2013. https://github.com/karlseguin/the-little-redis-book.
[2] "Replication." http://redis.io/topics/replication
[3] Malaika, Susan, and Matthias Nicola. "Data Normalization Reconsidered, Part 1: The History of Business Records," n.d. http://www.ibm.com/developerworks/data/library/techarticle/dm-1112normalization/.
[4] "Consistent Hashing | Michael Nielsen." Accessed September 28, 2013. http://michaelnielsen.org/blog/consistent-hashing/.
[5] "Memcached - a Distributed Memory Object Caching System." Accessed September 29, 2013. http://memcached.org/.
[6] "Redis." Accessed September 29, 2013. http://redis.io/.