<p><strong>Importing and visualizing csv data with Ruby on Rails and Chartkick</strong> (Thomas Roest, 2017-05-04)</p>
<p>Importing csv data is a common and fairly standard task in Rails. Although it might look simple, there are quite a few things to think about when working with csv files. This short tutorial covers one approach to importing csv files with Ruby on Rails, and we’ll add some visualization with Chartkick as well. Let’s get started!</p>
<h1 id="1-setup-a-rails-project-to-work-with">1. Set up a Rails project to work with</h1>
<p>First, let’s set up a simple Rails app with a Postgres database (assuming you have Postgres installed), then run the command to create the database.</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rails new csv_importer <span class="nt">-d</span> postgresql
<span class="nb">cd </span>csv_importer
rails db:create</code></pre></figure>
<p>Further setup includes the Chartkick gem and pry-rails. The latter is optional but very useful when playing around in the Rails console.
Add the gems to your Gemfile and run <code class="highlighter-rouge">bundle install</code>.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">gem</span> <span class="s1">'chartkick'</span>
<span class="n">group</span> <span class="ss">:development</span><span class="p">,</span> <span class="ss">:test</span> <span class="k">do</span>
<span class="n">gem</span> <span class="s1">'pry-rails'</span>
<span class="k">end</span></code></pre></figure>
<h1 id="2-the-dataset">2. The dataset</h1>
<p>Now we have to find some data to work with. We could use a simple, clean csv that we generate ourselves, but where’s the fun in that? Let’s find something a bit more real-world. In this tutorial we will use the <a href="https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset">IMDB 5000 movie dataset</a> from Kaggle. Kaggle has lots of interesting stuff for everyone who wants to experiment with data, including various free datasets.</p>
<p>We could add the file to our Rails root folder, although in production it’s more likely that you will access and parse the file from some url. If you want to follow along with this tutorial, upload the csv to some cloud storage provider (s3, dropbox or others) and make sure the file is publicly accessible.</p>
<h1 id="3-generate-the-movie-model-and-migration">3. Generate the Movie model and migration</h1>
<p>Now that we have our dataset, we have to start thinking about how we want to store the data. For now at least we know that we need a movies table, as each row in the csv file represents a movie.</p>
<p>Generate a movie model. We’ll have a look at the migration and the required columns afterwards.</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rails g model Movie</code></pre></figure>
<p>When it comes to deciding what types to use, the first thing to do is look at the information we already have. Have a look at the preview section in the documentation that comes with the dataset on Kaggle. There’s a toggle included that you can use to display the intended data type for each of the columns. This will be our starting point for adding the columns to our migration.</p>
<p>Database types and how to use them is quite a large topic which you should definitely learn about if you haven’t already. <a href="http://stackoverflow.com/questions/17918117/rails-4-list-of-available-datatypes/22725797#22725797">This stackoverflow post</a> might be a good place to start.</p>
<p>You can use the following command in the Rails console to show the data types that your database supports:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Base</span><span class="p">.</span><span class="nf">connection</span><span class="p">.</span><span class="nf">native_database_types</span><span class="p">.</span><span class="nf">keys</span></code></pre></figure>
<p>If you’re using PostgreSQL, the list will be considerably longer than with MySQL or SQLite. But to keep things simple, we’ll stick with the few that are mentioned in the column descriptions in the Kaggle documentation (String, Numeric and DateTime).</p>
<p>So what else is important here? One thing we have to consider is making sure that our imported rows are unique. We don’t want duplicate rows in our database when updating or importing the csv more than once. Another reason (as I found out) is that there are quite a few duplicate records in this dataset that we need to filter out.</p>
<p>The problem is that there is no real unique identifier for any particular row. You could say the movie title should be unique, but that doesn’t take into account remakes of movies with the same title.
One possible solution is to validate the uniqueness of the director name and movie title combination. It’s a reasonable assumption that at least those should be unique.</p>
<p>We can add this constraint by adding a unique index for the director name and movie title. Later on we will add a validation in the code as well.</p>
<p>Keep in mind that there are more things here that could be important such as other indexes, constraints, precision and scale for working with currencies. But for now, this is sufficient.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">class</span> <span class="nc">CreateMovies</span> <span class="o"><</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">5.0</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">change</span>
<span class="n">create_table</span> <span class="ss">:movies</span> <span class="k">do</span> <span class="o">|</span><span class="n">t</span><span class="o">|</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:color</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:director_name</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:num_critic_for_reviews</span>
<span class="n">t</span><span class="p">.</span><span class="nf">datetime</span> <span class="ss">:duration</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:director_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:actor_3_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:actor_2_name</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:actor_1_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:gross</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:genres</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:actor_1_name</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:movie_title</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:num_voted_users</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:cast_total_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:actor_3_name</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:facenumber_in_poster</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:plot_keywords</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:movie_imdb_link</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:num_user_for_reviews</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:language</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:country</span>
<span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:content_rating</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:budget</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:title_year</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:actor_2_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:imdb_score</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:aspect_ratio</span>
<span class="n">t</span><span class="p">.</span><span class="nf">decimal</span> <span class="ss">:movie_facebook_likes</span>
<span class="n">t</span><span class="p">.</span><span class="nf">timestamps</span>
<span class="k">end</span>
<span class="n">add_index</span> <span class="ss">:movies</span><span class="p">,</span> <span class="p">[</span><span class="ss">:director_name</span><span class="p">,</span> <span class="ss">:movie_title</span><span class="p">],</span> <span class="ss">unique: </span><span class="kp">true</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>Run the migration to create the movies table:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rails db:migrate</code></pre></figure>
<h1 id="4-create-a-rake-task-to-import-the-data">4. Create a rake task to import the data</h1>
<p>Now that we have our data and database set up, we can start with our import script. Let’s generate a rake task to import the data.</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rails g task csv_import movie_data</code></pre></figure>
<p>Make sure that you have a clear overview of the requirements of your import task. For example: where will the file come from, how large is it, how often will imports run, and what about memory usage and logging?</p>
<p>In this basic example task, we open and read the file from a url with OpenURI. This results in a string with the csv contents. We can use the Ruby CSV library to parse the result, convert each row to a hash and save it to the database.</p>
<p>We want to show some useful output as well, so we’ll keep track of the total number of rows imported and the number of duplicate records.
If you’re running this in production on a frequent basis, it’s best to add a separate logger for this task. You can find more best practices concerning rake tasks in
<a href="https://edelpero.svbtle.com/everything-you-always-wanted-to-know-about-writing-good-rake-tasks-but-were-afraid-to-ask">this excellent blog post</a>.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'csv'</span>
<span class="nb">require</span> <span class="s1">'open-uri'</span>
<span class="n">namespace</span> <span class="ss">:csv_import</span> <span class="k">do</span>
<span class="n">desc</span> <span class="s2">"Import movie data"</span>
<span class="n">task</span> <span class="ss">movie_data: :environment</span> <span class="k">do</span>
<span class="n">url</span> <span class="o">=</span> <span class="s1">'https://path-to-your-file.csv'</span>
<span class="c1"># URI.open is provided by open-uri; plain Kernel#open no longer opens URLs on Ruby 3.0+</span>
<span class="n">csv_string</span> <span class="o">=</span> <span class="no">URI</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="nf">read</span>
<span class="n">total_count</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">duplicate_count</span> <span class="o">=</span> <span class="mi">0</span>
<span class="no">CSV</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">csv_string</span><span class="p">,</span> <span class="ss">headers: </span><span class="kp">true</span><span class="p">,</span> <span class="ss">header_converters: :symbol</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">row</span><span class="o">|</span>
<span class="n">movie_hash</span> <span class="o">=</span> <span class="n">row</span><span class="p">.</span><span class="nf">to_hash</span>
<span class="n">movie</span> <span class="o">=</span> <span class="no">Movie</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="n">movie_hash</span><span class="p">)</span>
<span class="k">if</span> <span class="n">movie</span><span class="p">.</span><span class="nf">persisted?</span>
<span class="n">total_count</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="nb">puts</span> <span class="s2">"duplicate: </span><span class="si">#{</span><span class="n">movie</span><span class="p">.</span><span class="nf">director_name</span><span class="si">}</span><span class="s2"> - </span><span class="si">#{</span><span class="n">movie</span><span class="p">.</span><span class="nf">movie_title</span><span class="si">}</span><span class="s2">"</span>
<span class="n">duplicate_count</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">"Imported </span><span class="si">#{</span><span class="n">total_count</span><span class="si">}</span><span class="s2"> rows, </span><span class="si">#{</span><span class="n">duplicate_count</span><span class="si">}</span><span class="s2"> duplicate rows were not added"</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
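<p>To see what the parsing step produces on its own, here is a minimal sketch that runs without Rails or a database. The two sample rows and the <code class="highlighter-rouge">rows_from_csv</code> helper are made up for illustration; only the <code class="highlighter-rouge">CSV.parse</code> options are the same as in the task above.</p>

```ruby
require 'csv'

# Hypothetical sample data; the real file has many more columns.
SAMPLE_CSV = <<~DATA
  director_name,movie_title,imdb_score
  James Cameron,Avatar,7.9
  Sam Mendes,Spectre,6.8
DATA

# Parse a csv string into an array of hashes keyed by symbolized headers.
def rows_from_csv(csv_string)
  CSV.parse(csv_string, headers: true, header_converters: :symbol).map(&:to_hash)
end

rows = rows_from_csv(SAMPLE_CSV)
rows.each { |row| puts row.inspect }
```

<p>Note that every value is still a String at this point (for example <code class="highlighter-rouge">"7.9"</code>). In the rake task it’s ActiveRecord that casts the values to the column types when the record is created.</p>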
<p>If you run the task at this point, Postgres will raise a uniqueness error. This is expected, as the task has most likely encountered a duplicate record within the dataset. But we want to make sure that our task continues to run and saves the records that are valid. We can do this by adding a validation to the Movie model.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">class</span> <span class="nc">Movie</span> <span class="o"><</span> <span class="no">ApplicationRecord</span>
<span class="n">validates</span> <span class="ss">:movie_title</span><span class="p">,</span> <span class="ss">uniqueness: </span><span class="p">{</span> <span class="ss">scope: :director_name</span> <span class="p">}</span>
<span class="k">end</span></code></pre></figure>
<p>Because we use <code class="highlighter-rouge">Movie.create</code> without the ! (bang) in our rake task, a failed validation doesn’t raise an exception. Instead, <code class="highlighter-rouge">create</code> returns the unsaved record, so our task continues to run.
We use the <code class="highlighter-rouge">persisted?</code> method to check whether the record was saved.</p>
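<p>If you want to reason about the duplicate handling without touching the database, the same idea can be sketched in plain Ruby. This is only an illustration: the sample rows and the <code class="highlighter-rouge">partition_duplicates</code> helper are hypothetical, and a Set stands in for the uniqueness check on the director and title combination.</p>

```ruby
require 'set'

# Hypothetical rows, already parsed into hashes.
rows = [
  { director_name: 'James Cameron', movie_title: 'Avatar' },
  { director_name: 'Sam Mendes',    movie_title: 'Spectre' },
  { director_name: 'James Cameron', movie_title: 'Avatar' } # repeated row
]

# Split rows into first occurrences and repeats of a (director, title) pair.
def partition_duplicates(rows)
  seen = Set.new
  # Set#add? returns nil when the key is already present.
  rows.partition { |row| seen.add?([row[:director_name], row[:movie_title]]) }
end

imported, duplicates = partition_duplicates(rows)
puts "Imported #{imported.size} rows, #{duplicates.size} duplicate rows were not added"
```

<p>The first list plays the role of the records where <code class="highlighter-rouge">persisted?</code> is true, the second list the ones that were rejected by the validation.</p>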
<p>Now we can run the task again with:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rake csv_import:movie_data</code></pre></figure>
<p>Now the task continues to run, and will display the records that are already in the database (based on the unique director and movie title combination).</p>
<p>Have a look at the data in the Rails console by retrieving some records. It should include around 4919 rows. As you can see from our rake task output, there are quite a few duplicate rows in this dataset.</p>
<h1 id="5-visualizing-data-with-chartkick">5. Visualizing data with chartkick</h1>
<p>One of the easiest ways to visualize data with Rails is by using Chartkick, a very useful Ruby gem that allows us to generate charts with only a few lines of Ruby. Under the hood, Chartkick can use Chart.js, Google Charts or Highcharts. We’ve already added the Chartkick gem to our Gemfile, so the only remaining setup is to include Chartkick in application.js.</p>
<p>To set up Chartkick with Chart.js, add chartkick and Chart.bundle to your application.js file.</p>
<figure class="highlight"><pre><code class="language-js" data-lang="js"><span class="c1">//= require jquery</span>
<span class="c1">//= require jquery_ujs</span>
<span class="c1">//= require turbolinks</span>
<span class="c1">//= require Chart.bundle</span>
<span class="c1">//= require chartkick</span>
<span class="c1">//= require_tree .</span></code></pre></figure>
<p>We’ll generate two charts, displaying the following data:</p>
<ol>
<li>
<p>Our first chart will group movies by content rating (e.g. PG-13, R) and display the number of movies in each group.</p>
</li>
<li>
<p>Our second chart will display the number of Facebook likes for each of the 30 highest rated movies (by imdb score).</p>
</li>
</ol>
<p>Start by adding the following scopes to the movie model. The first scope groups the movies by content rating, counts the number of movies in each group and returns a Hash.</p>
<p>The second scope returns the 30 highest rated movies based on imdb_score, as an ActiveRecord relation.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">class</span> <span class="nc">Movie</span> <span class="o"><</span> <span class="no">ApplicationRecord</span>
<span class="n">validates</span> <span class="ss">:movie_title</span><span class="p">,</span> <span class="ss">uniqueness: </span><span class="p">{</span> <span class="ss">scope: :director_name</span> <span class="p">}</span>
<span class="n">scope</span> <span class="ss">:content_rating</span><span class="p">,</span> <span class="o">-></span> <span class="p">{</span> <span class="n">group</span><span class="p">(</span><span class="ss">:content_rating</span><span class="p">).</span><span class="nf">count</span> <span class="p">}</span>
<span class="n">scope</span> <span class="ss">:highest_rated</span><span class="p">,</span> <span class="o">-></span> <span class="p">{</span> <span class="n">order</span><span class="p">(</span><span class="ss">imdb_score: :desc</span><span class="p">).</span><span class="nf">limit</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="p">}</span>
<span class="k">end</span></code></pre></figure>
<p>You can experiment with these scopes in the Rails console by using Movie.content_rating and Movie.highest_rated.</p>
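<p>If you don’t have the data imported yet, the following plain Ruby sketch gives a rough idea of the shape of both results. The <code class="highlighter-rouge">SampleMovie</code> struct and its values are made up, and <code class="highlighter-rouge">tally</code> and <code class="highlighter-rouge">max_by</code> only imitate in memory what the database does for the real scopes.</p>

```ruby
# Hypothetical stand-in for the Movie model, with made-up values.
SampleMovie = Struct.new(:movie_title, :content_rating, :imdb_score)

movies = [
  SampleMovie.new('A', 'PG-13', 7.9),
  SampleMovie.new('B', 'R', 8.4),
  SampleMovie.new('C', 'PG-13', 6.1),
  SampleMovie.new('D', 'R', 9.0)
]

# Roughly the shape of group(:content_rating).count: rating => number of movies.
# Enumerable#tally needs Ruby 2.7 or newer.
rating_counts = movies.map(&:content_rating).tally

# Roughly order(imdb_score: :desc).limit(2); the limit is lowered to 2 for this sample.
highest_rated = movies.max_by(2, &:imdb_score)

puts rating_counts.inspect
puts highest_rated.map(&:movie_title).inspect
```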
<p>Next, we need a controller and index view to display the charts.</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">rails g controller movies index</code></pre></figure>
<p>You can pass data to Chartkick as a Hash or an Array. Our first scope (content_rating) already returns a Hash, so we can pass it directly to the view.</p>
<p>Our second scope returns an ActiveRecord relation, which we need to turn into a hash ourselves. One way to do this is with a method that creates an empty hash, merges in the selected data for each movie object and returns the hash so we can pass it to Chartkick.
In this example, I’ve included the movie title and imdb score on the y-axis and the number of Facebook likes on the x-axis.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">class</span> <span class="nc">MoviesController</span> <span class="o"><</span> <span class="no">ApplicationController</span>
<span class="k">def</span> <span class="nf">index</span>
<span class="vi">@content_rating</span> <span class="o">=</span> <span class="no">Movie</span><span class="p">.</span><span class="nf">content_rating</span>
<span class="vi">@movies_with_fb_likes</span> <span class="o">=</span> <span class="n">convert_to_hash</span><span class="p">(</span><span class="no">Movie</span><span class="p">.</span><span class="nf">highest_rated</span><span class="p">)</span>
<span class="k">end</span>
<span class="kp">private</span>
<span class="k">def</span> <span class="nf">convert_to_hash</span><span class="p">(</span><span class="n">relation</span><span class="p">)</span>
<span class="n">chart_data</span> <span class="o">=</span> <span class="p">{}</span>
<span class="n">relation</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">movie</span><span class="o">|</span>
<span class="n">chart_data</span><span class="p">.</span><span class="nf">merge!</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">movie</span><span class="p">.</span><span class="nf">movie_title</span><span class="si">}</span><span class="s2"> </span><span class="si">#{</span><span class="n">movie</span><span class="p">.</span><span class="nf">imdb_score</span><span class="si">}</span><span class="s2">"</span> <span class="o">=></span> <span class="n">movie</span><span class="p">.</span><span class="nf">movie_facebook_likes</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">chart_data</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
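<p>The same label to value hash can also be built in a single expression with <code class="highlighter-rouge">to_h</code> and a block (Ruby 2.6 or newer). The sketch below uses a made-up <code class="highlighter-rouge">ChartMovie</code> struct and a hypothetical <code class="highlighter-rouge">chart_data_for</code> helper instead of real records, so treat it as an alternative to try out, not as a replacement you have to use.</p>

```ruby
# Hypothetical stand-in for Movie records, with made-up values.
ChartMovie = Struct.new(:movie_title, :imdb_score, :movie_facebook_likes)

movies = [
  ChartMovie.new('First Movie', 9.0, 12_000),
  ChartMovie.new('Second Movie', 8.7, 3_500)
]

# Build { "title score" => likes }, the label => value shape passed to bar_chart.
def chart_data_for(movies)
  movies.to_h do |movie|
    ["#{movie.movie_title} #{movie.imdb_score}", movie.movie_facebook_likes]
  end
end

chart_data = chart_data_for(movies)
puts chart_data.inspect
```

<p>Keep in mind that with either approach two movies with the same title and score end up under the same key, so the later one overwrites the earlier one.</p>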
<p>And finally, to generate the charts with the data, add the following lines to the index view.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><</span><span class="err">%=</span> <span class="na">column_chart</span> <span class="err">@</span><span class="na">content_rating</span><span class="err">,</span> <span class="na">width:</span> <span class="err">"</span><span class="na">600px</span><span class="err">"</span> <span class="err">%</span><span class="nt">></span>
<span class="nt"><</span><span class="err">%=</span> <span class="na">bar_chart</span> <span class="err">@</span><span class="na">movies_with_fb_likes</span><span class="err">,</span> <span class="na">height:</span> <span class="err">"</span><span class="na">600px</span><span class="err">",</span> <span class="na">width:</span> <span class="err">"</span><span class="na">600px</span><span class="err">"</span> <span class="err">%</span><span class="nt">></span></code></pre></figure>
<p>That’s it! Only one line of Ruby per chart in our view. Need a line chart instead? Just change it to:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><</span><span class="err">%=</span> <span class="na">line_chart</span> <span class="err">@</span><span class="na">content_rating</span><span class="err">,</span> <span class="na">width:</span> <span class="err">"</span><span class="na">600px</span><span class="err">"</span> <span class="err">%</span><span class="nt">></span></code></pre></figure>
<p>Check out <a href="http://chartkick.com/">the Chartkick docs</a> and do some experimenting with different charts and options!</p>
<h1 id="other-useful-sources">Other useful sources</h1>
<ol>
<li>
<p><a href="http://dalibornasevic.com/posts/68-processing-large-csv-files-with-ruby">http://dalibornasevic.com/posts/68-processing-large-csv-files-with-ruby</a></p>
</li>
<li>
<p><a href="https://edelpero.svbtle.com/everything-you-always-wanted-to-know-about-writing-good-rake-tasks-but-were-afraid-to-ask">https://edelpero.svbtle.com/everything-you-always-wanted-to-know-about-writing-good-rake-tasks-but-were-afraid-to-ask</a></p>
</li>
</ol>
<hr />
<p><strong><em>Liked this post? Feedback or questions? Let me know by <a href="https://medium.com/@thomasroest/importing-and-visualizing-csv-data-with-ruby-on-rails-and-chartkick-ddd2a1025ebc">liking or commenting on Medium</a>.</em></strong></p>
<p><strong>Hosting static sites with Amazon S3 and Cloudflare</strong> (2017-03-25)</p>
<p>Last year I wrote about <a href="https://thomasroest.com/2016/11/05/set-up-a-jekyll-site-on-a-vps-with-ubuntu-nginx-and-letsencrypt.html">hosting a static site on a vps</a>. This works, but I wouldn’t recommend this approach anymore if you just want to host a simple static site. Managing your own server should be avoided if not necessary. Setting up our static site on Amazon S3 with Cloudflare allows for a cheaper setup with minimal configuration and several benefits provided by Cloudflare.</p>
<h1 id="1-create-an-amazon-s3-bucket">1. Create an Amazon S3 bucket</h1>
<p>Start by creating a bucket on <a href="https://aws.amazon.com/s3/">Amazon S3</a> with the name of your desired domain (examplesite.com) and select a region near you.</p>
<h1 id="2-upload-your-site-files">2. Upload your site files</h1>
<p>You can create the static site files yourself or you can use a generator like <a href="https://jekyllrb.com/">Jekyll</a> or <a href="http://gohugo.io">Hugo</a>. When you’ve created the build, upload the contents of your build/public folder to your bucket. In the new Amazon S3 interface, you can upload by dragging and dropping multiple files at once.</p>
<h1 id="3-set-permissons">3. Set permissions</h1>
<p>To use web hosting from your bucket, the items in the bucket should be publicly readable.</p>
<p>Go to your bucket -> Permissions tab -> Bucket policy, add the JSON snippet which you can find <a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteAccessPermissionsReqd.html">here</a> (replace the example bucket resource with your bucket name), and save the configuration.</p>
<p><code class="highlighter-rouge">"Resource":["arn:aws:s3:::yourdomain.com/*"]</code></p>
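<p>If you’d rather generate the policy than edit it by hand, a small Ruby script can fill in the bucket name for you. This is a sketch based on the general shape of a public read policy; the <code class="highlighter-rouge">public_read_policy</code> helper is hypothetical, and you should compare the output with the snippet in the AWS documentation linked above before saving it.</p>

```ruby
require 'json'

# Build a public-read bucket policy for the given bucket name.
# Assumption: the policy only needs to allow s3:GetObject on all objects in the bucket.
def public_read_policy(bucket_name)
  {
    'Version' => '2012-10-17',
    'Statement' => [
      {
        'Sid' => 'PublicReadGetObject',
        'Effect' => 'Allow',
        'Principal' => '*',
        'Action' => ['s3:GetObject'],
        'Resource' => ["arn:aws:s3:::#{bucket_name}/*"]
      }
    ]
  }
end

puts JSON.pretty_generate(public_read_policy('yourdomain.com'))
```

<p>Paste the printed JSON into the bucket policy editor.</p>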
<h1 id="4-enable-webhosting-for-your-bucket">4. Enable webhosting for your bucket</h1>
<p>Go to the properties tab -> static website hosting and select the option ‘Use this bucket to host a website’. Set the index and error documents and save your configuration. This will make your site available on the specified endpoint url.</p>
<h1 id="5-custom-domains-and-ssl-with-cloudflare">5. Custom domains and SSL with Cloudflare</h1>
<p>We can use Cloudflare to serve our site on a custom domain and to benefit from its SSL and CDN options. Sign up for <a href="https://www.cloudflare.com/">Cloudflare</a> and go through the ‘add site’ steps. Add your root domain <code class="highlighter-rouge">example.com</code> and begin the scan. Depending on your DNS provider, Cloudflare will find a bunch of records that already exist. If any A records are found, you can delete those.</p>
<p>What we need is one CNAME record with name: @, content: your bucket site endpoint (example <code class="highlighter-rouge">yourbucketname.com.s3-website-eu-west-1.amazonaws.com</code>).</p>
<p>Continue with the setup and select the free plan. The following step is to update your name servers. Go to your domain registrar and change the name servers of your domain to the ones provided by Cloudflare.</p>
<p>This process should result in a ‘pending’ status in your Cloudflare dashboard. Recheck your name servers to check if your domain is active. When your status changes to active, your site should be available on the custom url.</p>
<h1 id="6-ssl">6. SSL</h1>
<p>In your Cloudflare dashboard go to crypto –> ssl and set SSL to flexible. Again, it can take some time for the certificate to be issued.
We also want to make sure that http traffic is redirected to https. We can do this by adding a page rule in the ‘page rules’ section.</p>
<p>Add a new rule for your domain with <code class="highlighter-rouge">http://example.com/*</code> and add the ‘Always Use HTTPS’ setting.</p>
<p><strong>Properly setting up Redis and Sidekiq in production on Ubuntu 16.04</strong> (2017-03-04)</p>
<p>Properly setting up background job processing in production involves quite a few steps. This is an up-to-date, high-level overview of the steps involved in setting up Redis and Sidekiq on the latest version of Ubuntu (16.04). Details of this configuration may change, so I won’t be explaining everything step by step and we will use official documentation wherever possible.</p>
<p>This article covers the following:</p>
<ol>
<li>Installing and configuring Redis</li>
<li>Security basics</li>
<li>Installing Sidekiq</li>
<li>Running Sidekiq with Systemd</li>
<li>Deployment with Capistrano</li>
</ol>
<h2 id="1-installing-and-configuring-redis">1. Installing and configuring Redis</h2>
<p>We start with Redis, the in-memory data structure store that Sidekiq uses to process background jobs. To install Redis, we will follow the quickstart guide from the official documentation.</p>
<p><a href="https://redis.io/topics/quickstart">the Redis quickstart guide</a></p>
<p>To install Redis you can either use apt-get or install from source.
The recommended way (following the quickstart guide) is compiling it from source with the following commands (using a Linux package manager is discouraged, although <a href="https://news.ycombinator.com/item?id=13906567">not everyone agrees</a>).</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">wget http://download.redis.io/redis-stable.tar.gz
tar xvzf redis-stable.tar.gz
<span class="nb">cd </span>redis-stable
make</code></pre></figure>
<p>This includes <code class="highlighter-rouge">redis-server</code> and <code class="highlighter-rouge">redis-cli</code>, which we will be using later on in this tutorial. In the <code class="highlighter-rouge">redis-stable</code> folder, you can run the following commands to copy the executables to <code class="highlighter-rouge">/usr/local/bin</code> so you can run the redis-server and redis-cli commands from your home / user directory.</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nb">sudo </span>cp src/redis-server /usr/local/bin/
<span class="nb">sudo </span>cp src/redis-cli /usr/local/bin/</code></pre></figure>
<p>You can start and test Redis with the <code class="highlighter-rouge">redis-server</code> and <code class="highlighter-rouge">redis-cli</code> commands as explained in the quickstart guide. This is fine for testing purposes. However, this is not the recommended way to run Redis in a production environment.</p>
<p><a href="https://redis.io/topics/quickstart">The Redis quickstart guide</a> has a section called ‘Installing Redis more properly’. This setup involves using an init script and configuration file to set up a Redis instance on a specified port. This approach is ‘strongly recommended’. The steps to properly install Redis are clearly explained in the quickstart guide, and I recommend reading and following these steps before continuing.</p>
<h2 id="2-security">2. Security</h2>
<p>Redis is not optimized for security by default, so we have to take some measures to keep our Redis instance secure. You can start by reading the ‘Securing Redis’ section in the quickstart guide. It lists four recommendations; we will focus on the first two.</p>
<p><strong>1 . Make sure the port Redis uses to listen for connections is firewalled.</strong></p>
<p>The first recommendation is to set up a firewall to limit access to our server and Redis instance. We can do this by configuring UFW (Uncomplicated Firewall), which is installed on Ubuntu by default.</p>
<p>To add basic firewall security, follow the first 5 steps in the following guide (up until Specific Port Ranges). This includes adding default policies for connections, allowing ssh and only allowing http and https connections.</p>
<p><a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-with-ufw-on-ubuntu-16-04">How to setup a firewall with ufw on Ubuntu 16.04</a></p>
<p><strong>2 . Binding to localhost and protected mode</strong></p>
<p>If you followed the steps in ‘Installing Redis more properly’, this should already be configured by default. Open your Redis instance configuration file in <code class="highlighter-rouge">/etc/redis/6379.conf</code>. The bind directive is the uncommented line containing <code class="highlighter-rouge">bind 127.0.0.1</code>. This makes sure you can only access Redis locally. Another security layer added by default is ‘protected mode’. You should not disable this unless your requirements include the option for other hosts to connect to Redis.</p>
<p>Recommendations 3 and 4 are optional and can be added depending on your application’s requirements.</p>
<h2 id="3-installing-sidekiq">3. Installing Sidekiq</h2>
<p>Install Sidekiq by adding it to your Gemfile and deploying (or by using <code class="highlighter-rouge">gem install sidekiq</code>). By default, a configuration file is not required. If you do want advanced options you can add one, see this section in the <a href="https://github.com/mperham/sidekiq/wiki/Advanced-Options">Sidekiq wiki</a>.
Later on, we can specify the path to this configuration file when we manage starting Sidekiq with Systemd.</p>
<p>By default Redis uses port 6379. This is also the default port Sidekiq tries to connect to. If you’re using a different port, you need to specify it in your application by using an initializer for example, see <a href="https://github.com/mperham/sidekiq/wiki/Using-Redis">using Redis in the sidekiq wiki </a> for more information.</p>
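<p>As a rough sketch, an initializer for a non-default port could look like the following. The file name <code class="highlighter-rouge">config/initializers/sidekiq.rb</code>, the port 6380 and the <code class="highlighter-rouge">REDIS_URL</code> fallback are assumptions for illustration; <code class="highlighter-rouge">Sidekiq.configure_server</code> and <code class="highlighter-rouge">Sidekiq.configure_client</code> are the hooks described in the Sidekiq wiki.</p>

```ruby
# config/initializers/sidekiq.rb (hypothetical file name)
# Assumption: Redis listens on port 6380 instead of the default 6379.
redis_url = ENV.fetch('REDIS_URL', 'redis://127.0.0.1:6380/0')

# The guard only exists so this snippet can also be loaded outside a Rails app.
if defined?(Sidekiq)
  # Used by the Sidekiq process that executes the jobs.
  Sidekiq.configure_server do |config|
    config.redis = { url: redis_url }
  end

  # Used by the application process that enqueues the jobs.
  Sidekiq.configure_client do |config|
    config.redis = { url: redis_url }
  end
end
```

<p>Both blocks point at the same URL, so the process that enqueues jobs and the process that runs them talk to the same Redis instance.</p>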
<h2 id="4-running-sidekiq-in-production-using-systemd">4. Running Sidekiq in production using Systemd</h2>
<p>In the <a href="https://github.com/mperham/sidekiq/wiki/Deploying-to-Ubuntu">official documentation</a>, Upstart is recommended for controlling your Sidekiq processes. However, in recent versions of Ubuntu, Upstart has been replaced with Systemd. There has been a lot of debate in Linux land on whether this is a good thing or not, but since Ubuntu now comes with Systemd by default, that’s what we’ll be using.</p>
<p>Like Upstart, Systemd is an init service which handles starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running. Check out <a href="https://www.digitalocean.com/community/tutorials/systemd-essentials-working-with-services-units-and-the-journal">this article</a> if you want to read more about Systemd.</p>
<p>To set up a Sidekiq Systemd service, we need a service configuration file. An example file can be found in the <a href="https://github.com/mperham/sidekiq/blob/master/examples/systemd/sidekiq.service">sidekiq github repo</a>. Copy this file to your server and place it in
<code class="highlighter-rouge">/lib/systemd/system</code></p>
<p>There are two lines here that you need to adjust to your settings:</p>
<ol>
<li>
<p>The working directory path. Change this to your application path, for example:
<code class="highlighter-rouge">WorkingDirectory=/home/deploy/my_app/current</code></p>
</li>
<li>
<p>The ExecStart path. This specifies the path and command to start Sidekiq. Now this could be different depending on your settings and ruby version managers (if you use one).
For example, my current ExecStart path is:<br />
<code class="highlighter-rouge">ExecStart=/home/deploy/.rbenv/shims/bundle exec sidekiq -e production</code></p>
</li>
</ol>
<p>You can add additional options to the ExecStart command, like the location of your configuration file with <code class="highlighter-rouge">-C config/sidekiq.yml</code> or a log file with <code class="highlighter-rouge">-L log/sidekiq.log</code>. These are both optional. Sidekiq runs on default settings if you don’t specify a configuration file and the default logging is set to syslog in <code class="highlighter-rouge">/var/log/syslog</code>.</p>
<p>After setting up configuration, enable the Sidekiq service with:</p>
<p><code class="highlighter-rouge">systemctl enable sidekiq</code></p>
<p>Other useful commands for controlling the service are:</p>
<p><code class="highlighter-rouge">systemctl {start,stop,status,restart} sidekiq</code></p>
<p>or</p>
<p><code class="highlighter-rouge">service sidekiq {start,stop,status,restart}</code></p>
<h2 id="5-deploying">5. Deploying</h2>
<p>If you use Capistrano for deployment, you can make restarting Sidekiq part of the deployment process. You can find the code example on the <a href="https://github.com/mperham/sidekiq/wiki/Deploying-to-Ubuntu#using-capistrano">Sidekiq repo wiki</a>.</p>
<p>The only difference here is that instead of Upstart, we use Systemd. So the sidekiq:restart task should be changed to:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"> <span class="n">task</span> <span class="ss">:restart</span> <span class="k">do</span>
<span class="n">on</span> <span class="n">roles</span><span class="p">(</span><span class="ss">:app</span><span class="p">)</span> <span class="k">do</span>
<span class="n">execute</span> <span class="ss">:sudo</span><span class="p">,</span> <span class="ss">:systemctl</span><span class="p">,</span> <span class="ss">:restart</span><span class="p">,</span> <span class="ss">:sidekiq</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>Executing sudo commands with Capistrano does require additional configuration on the server. The ‘deploy’ user has to be able to execute this specific sudo command without a password prompt. With the following configuration you can set up passwordless sudo for the systemctl restart Sidekiq task.</p>
<p>Open the sudoers file on your server with <code class="highlighter-rouge">sudo visudo</code>. On Ubuntu, this will open the file with the nano text editor. To allow passwordless sudo for our Sidekiq restart task, add this to the bottom of the file, then save and exit with ctrl-x.</p>
<p><code class="highlighter-rouge">%deploy ALL=NOPASSWD:/bin/systemctl restart sidekiq</code></p>
<p>As always, be cautious when setting sudo privileges and using visudo. You can read more about the Sudoers file and visudo command <a href="https://www.digitalocean.com/community/tutorials/how-to-edit-the-sudoers-file-on-ubuntu-and-centos">here.</a></p>
<h2 id="whats-next">What’s next</h2>
<p>That’s it for now! As you can see setting this up properly takes quite a few steps. Hopefully this made it somewhat more manageable. Next up is Monitoring, an important part of a robust Redis/Sidekiq setup. Coming soon!</p>Properly setting up background job processing in production involves quite a few steps. This is an up-to-date, high-level overview of the steps involved in setting up Redis and Sidekiq on the latest version of Ubuntu (16.04). Details of this configuration may change, so I won’t be explaining everything step by step and we will use official documentation wherever possible.Modern JavaScript & Ruby on Rails with rails/webpacker2017-01-02T00:00:00+01:002017-01-02T00:00:00+01:00http://localhost:4000/2017/01/02/modern-javascript-and-ruby-on-rails-with-rails-webpacker<p>There’s no denying that JavaScript has evolved a lot in the past years, and for Rails, keeping up has been a challenge. The result is a large variety of approaches to use modern js, es6/7 and frameworks with Rails, including the ‘javascript in a gem’ approach and the use of bundlers like Browserify and Webpack.</p>
<p>From the various bundler options, it seems that Webpack has emerged as the most popular one. For Rails, Webpack seems to be the way forward as well, and support to use it will be integrated into Rails 5.1.</p>
<blockquote class="twitter-tweet" data-lang="nl"><p lang="en" dir="ltr"><a href="https://twitter.com/travisdmathis">@travisdmathis</a> <a href="https://twitter.com/hackteck">@hackteck</a> You'll be happy with Rails 5.1 then! Shipping with Yarn by default and --webpack option 👍</p>— DHH (@dhh) <a href="https://twitter.com/dhh/status/812741436747620352">24 december 2016</a></blockquote>
<script async="" src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>This integration is currently being developed as <a href="https://github.com/rails/webpacker">rails/webpacker</a>. The gem includes installers and configuration files that will make it easier to get started with Webpack. It comes with defaults like Babel for es6/7 and a React installer. Setup is fast, and requires little Webpack experience to get started. Rails/webpacker is likely to ship with Rails 5.1, but you can already start using it today.</p>
<!-- <script src="https://gist.github.com/ThomasRoest/d415d54bb81ade76c61dec7a445f65ba.js"></script> -->
<h2 id="getting-started">Getting started</h2>
<p>Let’s start by creating an example app using the latest Rails version. At the moment this is 5.0.1.</p>
<p><code class="highlighter-rouge">rails new webpacker_test</code></p>
<p><code class="highlighter-rouge">rails g controller static_pages home</code></p>
<p>And in our routes (<code class="highlighter-rouge">config/routes.rb</code>):
<code class="highlighter-rouge">root to: 'static_pages#home'</code></p>
<p>Another prerequisite is <a href="https://github.com/yarnpkg/yarn">Yarn</a>, a JavaScript package manager. If you haven’t used it before, you can install Yarn with Homebrew.</p>
<p><code class="highlighter-rouge">brew install yarn</code></p>
<h2 id="install-webpacker">Install webpacker</h2>
<p>We want the latest webpacker gem version from GitHub, so add the following to your Gemfile and run <code class="highlighter-rouge">bundle install</code>:</p>
<p><code class="highlighter-rouge">gem 'webpacker', github: 'rails/webpacker'</code></p>
<p>To install webpack, dependencies and to generate configuration files:</p>
<p><code class="highlighter-rouge">bin/rails webpacker:install</code></p>
<p>This generates the following:</p>
<ul>
<li>binstubs for yarn, webpack and webpack watcher</li>
<li>app/javascript folder</li>
<li>config/webpack</li>
<li>node_modules, package.json and yarn.lock in /vendor</li>
</ul>
<p>The default dependencies and configuration include Babel, which allows you to use the latest JavaScript syntax right away.</p>
<h2 id="usage">Usage</h2>
<p>In the app/javascript directory you can add ‘packs’ with bundled JavaScript files. You can add the bundle to your desired layout with the javascript_pack_tag. To start using it, add <code class="highlighter-rouge"><%= javascript_pack_tag 'application' %></code> to application.html.erb.</p>
<p>Let’s run our rails app with <code class="highlighter-rouge">bin/rails server</code>, and in a different terminal window run <code class="highlighter-rouge">bin/webpack-watcher</code>.</p>
<p>Let’s try it out by creating a new file in app/javascript/my_file.js, including an es6 class.</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">class</span> <span class="nx">User</span> <span class="p">{</span>
<span class="kd">constructor</span><span class="p">(</span><span class="nx">name</span><span class="p">,</span> <span class="nx">email</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="p">.</span><span class="nx">name</span> <span class="o">=</span> <span class="nx">name</span><span class="p">;</span>
<span class="k">this</span><span class="p">.</span><span class="nx">email</span> <span class="o">=</span> <span class="nx">email</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kd">const</span> <span class="nx">user</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">User</span><span class="p">(</span><span class="s1">'example_user'</span><span class="p">,</span> <span class="s1">'hello@example.com'</span><span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`name: </span><span class="p">${</span><span class="nx">user</span><span class="p">.</span><span class="nx">name</span><span class="p">}</span><span class="s2">, email: </span><span class="p">${</span><span class="nx">user</span><span class="p">.</span><span class="nx">email</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span></code></pre></figure>
<p>Add the file to your application.js pack with <code class="highlighter-rouge">require('my_file')</code>. The module name has to be a quoted string.
View the source code in the browser to see the bundled Webpack file with the transpiled code.</p>
<h2 id="adding-react">Adding React</h2>
<p>Rails webpacker comes with a React installer that you can use to install React and its dependencies with a single command.
The command <code class="highlighter-rouge">bin/rails webpacker:install:react</code> updates vendor/package.json, yarn.lock and config/webpack, and generates a new React pack in app/javascript/packs.</p>
<p>Add the React pack to your desired layout file and restart <code class="highlighter-rouge">bin/webpack-watcher</code> to start using it.</p>
<h2 id="adding-packages-with-yarn">Adding packages with yarn</h2>
<p>To manually install new packages, make sure you use <code class="highlighter-rouge">bin/yarn add new_package</code>. Using bin/yarn will make sure that the right package.json and yarn lockfile in the vendor folder are updated.</p>
<h2 id="deployment">Deployment</h2>
<p>To compile your js packs for deployment, add the <code class="highlighter-rouge">rails webpacker:compile</code> command to your deployment process. This will set up the production configuration, including digests (similar to the asset pipeline).</p>
<h3 id="other-resources">Other resources</h3>
<ul>
<li>
<p>An overview of JavaScript topics to learn in 2017. <a href="https://medium.com/javascript-scene/top-javascript-frameworks-topics-to-learn-in-2017-700a397b711#.jghs7mylt">Top javascript frameworks and topics to learn in 2017</a></p>
</li>
<li>
<p>A free course using vanilla js/es6, useful if you want to move away from jQuery. <a href="https://github.com/wesbos/JavaScript30">https://github.com/wesbos/JavaScript30</a></p>
</li>
<li>
<p>Understanding es6 by Nicholas Zakas
<a href="https://leanpub.com/understandinges6/read">https://leanpub.com/understandinges6/read</a></p>
</li>
<li>
<p>Webpack docs
<a href="https://webpack.github.io/docs/">https://webpack.github.io/docs/</a></p>
</li>
</ul>There’s no denying that JavaScript has evolved a lot in the past years, and for Rails, keeping up has been a challenge. The result is a large variety of approaches to use modern js, es6/7 and frameworks with Rails, including the ‘javascript in a gem’ approach and the use of bundlers like Browserify and Webpack.Introduction to Algorithms with Ruby2016-12-12T00:00:00+01:002016-12-12T00:00:00+01:00http://localhost:4000/2016/12/12/introduction-to-algorithms-with-ruby<p>There are many reasons why you should learn about algorithms. They are a fundamental part of computer science and they play a key role in modern technological innovation. Besides that, learning basic algorithms is a good way to develop your problem solving skills and to better understand the software and programming languages you work with on a daily basis.</p>
<p>So let’s say you’re a self taught Ruby developer and you want to improve your computer science fundamentals by learning some algorithms. Where do you start? The amount of theory available can be quite overwhelming. This guide is my attempt to narrow it down to a short introduction that you can start with right now, and you don’t even have to learn C or Java first.</p>
<h2 id="approach">Approach</h2>
<p>So what is an algorithm anyway?</p>
<blockquote>
<p>An algorithm is a plan, a logical step-by-step process for solving a problem.</p>
</blockquote>
<p>By this definition, any method/function can be an algorithm. However, algorithms are usually a bit more complex than your average function and often involve multiple steps to solve a problem. Anyway, the best way to learn what an algorithm is, is to implement one yourself.</p>
<p>Before doing so, we need to know about the required steps involved in our approach to solving problems.</p>
<ol>
<li>First, define a high level description of the problem, the input and desired output.</li>
<li>Second, break down the solution with <a href="http://users.csc.calpoly.edu/~jdalbey/SWE/pdl_std.html">pseudocode</a>, a form of structured English that resembles a programming language.</li>
<li>Convert your pseudocode to a working implementation in Ruby.</li>
</ol>
<p>After implementing your algorithm, try to keep improving it. Can it be done more efficiently? Can you refactor or rename things?</p>
<h2 id="selection-sort">Selection sort</h2>
<p>In this guide, we take two (relatively) simple algorithms in two of the most fundamental challenges in computer science: sorting and searching. The algorithms we use (selection sort and binary search) are not the fastest or most efficient, but they are a good starting point if you’ve never worked with algorithms before.</p>
<p>Let’s get started by following the steps in the process:</p>
<h4 id="1-problem-description-input-and-output">1. Problem description, input and output.</h4>
<p>There are two ways to do this. One is to find a high level description of selection sort, and implement the code yourself. However, if you don’t have any experience with algorithms, I recommend watching a tutorial that explains the algorithm step by step. Let’s head over to the <a href="https://www.youtube.com/channel/UClEEsT7DkdVO_fkrBw0OTrA">Mycodeschool</a> YouTube channel for a great explanation of selection sort.</p>
<p><a href="http://www.youtube.com/watch?feature=player_embedded&v=GUDLRan2DWM&list=PL2_aWCzGMAwKedT2KfDMB9YA5DgASZb3U&index=2" target="_blank"><img src="https://img.youtube.com/vi/GUDLRan2DWM/0.jpg" alt="Selection sort video by Mycodeschool on YouTube" width="240" height="180" border="5" /></a></p>
<p>Based on the explanation in the video, we can define the following problem description.</p>
<p>Given an array of integers, selection sort repeatedly scans the unsorted part of the array for its smallest element and swaps that element into the first unsorted position. While scanning, we keep track of the index of the smallest element found so far. Our algorithm should return a sorted array.</p>
<h4 id="2-breaking-down-the-problem-with-pseudocode">2. Breaking down the problem with pseudocode</h4>
<p>I recommend converting the pseudocode from the video to your own pseudocode. Try to understand each of the steps and add comments for extra clarification.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1">#function with the array and the number of elements as arguments</span>
<span class="n">selection</span> <span class="n">sort</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="c1">#main loop iterating over the array (with exception of the last element)</span>
<span class="k">for</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span> <span class="n">to</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">2</span>
<span class="c1">#storing the index of the minimum element, starting with the first unsorted element</span>
<span class="n">index_minimum</span> <span class="o">=</span> <span class="n">i</span>
<span class="c1">#scanning the rest of the array and comparing elements with the current minimum</span>
<span class="k">for</span> <span class="n">j</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span> <span class="n">to</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span>
<span class="c1">#if the element is less than the current minimum, update the minimum index</span>
<span class="k">if</span> <span class="n">a</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o"><</span> <span class="n">a</span><span class="p">[</span><span class="n">index_minimum</span><span class="p">]</span>
<span class="n">index_minimum</span> <span class="o">=</span> <span class="n">j</span>
<span class="c1">#swapping the elements</span>
<span class="n">temp</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
<span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">index_minimum</span><span class="p">]</span>
<span class="n">a</span><span class="p">[</span><span class="n">index_minimum</span><span class="p">]</span> <span class="o">=</span> <span class="n">temp</span></code></pre></figure>
<h4 id="3-ruby-implementation">3. Ruby implementation</h4>
<p>Now we can convert our language-independent pseudocode to Ruby. One thing that stands out in this example is the use of for loops. In general, using for loops is <a href="https://github.com/bbatsov/ruby-style-guide#no-for-loops">discouraged in Ruby</a>. So instead of using a for loop, we can use <code class="highlighter-rouge">each</code> to iterate over a range, as you can see in the example below.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1">#require "byebug"</span>
<span class="k">def</span> <span class="nf">selection_sort</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="n">n</span> <span class="o">-</span> <span class="mi">2</span><span class="p">).</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span>
<span class="n">i_min</span> <span class="o">=</span> <span class="n">i</span>
<span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="o">..</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">).</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">j</span><span class="o">|</span>
<span class="k">if</span> <span class="n">a</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o"><</span> <span class="n">a</span><span class="p">[</span><span class="n">i_min</span><span class="p">]</span>
<span class="n">i_min</span> <span class="o">=</span> <span class="n">j</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">temp</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
<span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">i_min</span><span class="p">]</span>
<span class="n">a</span><span class="p">[</span><span class="n">i_min</span><span class="p">]</span> <span class="o">=</span> <span class="n">temp</span>
<span class="c1">#byebug</span>
<span class="k">end</span>
<span class="n">a</span>
<span class="k">end</span>
<span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span><span class="mi">7</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span>
<span class="n">selection_sort</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="nf">size</span><span class="p">)</span></code></pre></figure>
<p>*extra: Use <a href="https://github.com/deivid-rodriguez/byebug">Byebug</a> to halt the loop and check the state of your array (a). Use ‘continue’ to run the next iteration. Place your byebug statement after the swapping of the elements.</p>
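<p>Once the implementation above works, you could refactor it. The sketch below is one possible variation, not the version from the video: the method name <code class="highlighter-rouge">selection_sort_copy</code> is made up for this example, the size is read from the array itself, and the input is copied so the original array stays untouched.</p>

```ruby
# Variation on the selection sort above (a sketch under the assumptions
# named in the text, not a definitive implementation).
def selection_sort_copy(array)
  sorted = array.dup
  last_index = sorted.size - 1

  (0...last_index).each do |i|
    # Find the index of the smallest element in the unsorted part (i..last_index).
    i_min = (i..last_index).min_by { |j| sorted[j] }
    # Parallel assignment swaps both elements without a temp variable.
    sorted[i], sorted[i_min] = sorted[i_min], sorted[i]
  end

  sorted
end

numbers = [2, 7, 4, 1, 5, 3]
p selection_sort_copy(numbers) # => [1, 2, 3, 4, 5, 7]
p numbers                      # => [2, 7, 4, 1, 5, 3]
```

<p>Copying the input costs extra memory, so whether this is an improvement depends on whether callers expect their array to be modified in place.</p>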
<h2 id="binary-search">Binary search</h2>
<p>So we’ve learned our first sorting algorithm. Up next is one of the most well known and fundamental search algorithms: binary search.</p>
<h4 id="1-problem-description-input-and-output-1">1. Problem description, input and output.</h4>
<p>Once again, we head over to Mycodeschool on YouTube. Try to understand every step, and pause the video if you need to.</p>
<p><a href="http://www.youtube.com/watch?feature=player_embedded&v=j5uXyPJ0Pew&list=PL2_aWCzGMAwL3ldWlrii6YeLszojgH77j" target="_blank"><img src="https://img.youtube.com/vi/j5uXyPJ0Pew/0.jpg" alt="Binary search video by Mycodeschool on YouTube" width="240" height="180" border="5" /></a></p>
<p>Based on the explanation, we can define the problem:</p>
<p>Given a sorted array (a precondition for binary search), find value x by halving the search space with each comparison. To do this, we keep track of the start and end index of our search space. The algorithm should return the index if the value is found, or “not found” otherwise.</p>
<h4 id="2-breaking-down-the-problem-with-pseudocode-1">2. Breaking down the problem with pseudocode</h4>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1">#search function with a sorted array, size of the array, and value to find</span>
<span class="n">binary_search</span><span class="p">(</span><span class="n">array</span><span class="p">,</span> <span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
<span class="c1"># set the search space to the entire array</span>
<span class="n">index_start</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">index_end</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span>
<span class="c1"># loop until we can't divide the array any further</span>
<span class="k">while</span> <span class="n">index_start</span> <span class="o"><=</span> <span class="n">index_end</span>
<span class="c1"># find the middle element in our search space</span>
<span class="n">middle</span> <span class="o">=</span> <span class="p">(</span> <span class="n">index_start</span> <span class="o">+</span> <span class="n">index_end</span> <span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
<span class="c1"># 3 possible cases in searching</span>
<span class="c1"># x matches the middle element</span>
<span class="k">if</span> <span class="n">array</span><span class="p">[</span><span class="n">middle</span><span class="p">]</span> <span class="o">==</span> <span class="n">x</span>
<span class="k">return</span> <span class="n">middle</span>
<span class="c1">#x is less than the middle element -> reduce search space / 2</span>
<span class="k">elsif</span> <span class="n">x</span> <span class="o"><</span> <span class="n">array</span><span class="p">[</span><span class="n">middle</span><span class="p">]</span>
<span class="n">index_end</span> <span class="o">=</span> <span class="n">middle</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="c1"># x is larger than middle element -> reduce search space / 2</span>
<span class="n">index_start</span> <span class="o">=</span> <span class="n">middle</span> <span class="o">+</span> <span class="mi">1</span>
<span class="c1"># return not found if our loop is completed and we've found no matching element</span>
<span class="k">return</span> <span class="o">-</span><span class="mi">1</span> <span class="p">(</span><span class="n">not</span> <span class="n">found</span><span class="p">)</span></code></pre></figure>
<h4 id="3-ruby-implementation-1">3. Ruby implementation</h4>
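<p>If your pseudocode is understandable, converting it to Ruby should not be difficult. See the example implementation below. Note that the final return statement is not necessary in Ruby, because a method returns its last evaluated expression, but I’ve left it in for some extra readability. The return inside the loop is required, since it stops the search as soon as the value is found.</p>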
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"> <span class="k">def</span> <span class="nf">binary_search</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
<span class="n">index_start</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">index_end</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">while</span> <span class="n">index_start</span> <span class="o"><=</span> <span class="n">index_end</span> <span class="k">do</span>
<span class="n">index_middle</span> <span class="o">=</span> <span class="p">(</span><span class="n">index_start</span> <span class="o">+</span> <span class="n">index_end</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
<span class="k">if</span> <span class="n">a</span><span class="p">[</span><span class="n">index_middle</span><span class="p">]</span> <span class="o">==</span> <span class="n">x</span>
<span class="k">return</span> <span class="n">index_middle</span>
<span class="k">elsif</span> <span class="n">x</span> <span class="o"><</span> <span class="n">a</span><span class="p">[</span><span class="n">index_middle</span><span class="p">]</span>
<span class="n">index_end</span> <span class="o">=</span> <span class="n">index_middle</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="n">index_start</span> <span class="o">=</span> <span class="n">index_middle</span> <span class="o">+</span> <span class="mi">1</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">return</span> <span class="s2">"not found"</span>
<span class="k">end</span>
<span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">]</span>
<span class="nb">puts</span> <span class="n">binary_search</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="nf">size</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span></code></pre></figure>
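<p>The implementation above returns either an index or the string “not found”. As a possible refinement, the sketch below returns <code class="highlighter-rouge">nil</code> when nothing is found, which is easier to use in a conditional. The method name <code class="highlighter-rouge">binary_search_index</code> is made up for this example. The last line cross-checks the result against <code class="highlighter-rouge">Array#bsearch_index</code>, Ruby’s built-in binary search (available since Ruby 2.3).</p>

```ruby
# Sketch: the same algorithm as above, but returning nil when nothing is found.
def binary_search_index(sorted, x)
  index_start = 0
  index_end = sorted.size - 1

  while index_start <= index_end
    index_middle = (index_start + index_end) / 2
    case x <=> sorted[index_middle]
    when 0  then return index_middle            # found
    when -1 then index_end = index_middle - 1   # continue in the left half
    else         index_start = index_middle + 1 # continue in the right half
    end
  end

  nil
end

a = [1, 2, 3, 4, 5, 7]
p binary_search_index(a, 3) # => 2
p binary_search_index(a, 6) # => nil

# Cross-check with the built-in binary search (find-any mode).
p a.bsearch_index { |value| 3 <=> value } # => 2
```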
<h2 id="bubble-sort">Bubble sort</h2>
<p>As an exercise, read a <a href="https://en.wikipedia.org/wiki/Bubble_sort">high level description</a> of Bubble Sort and try to implement the algorithm yourself, before watching the step by step explanation. Remember to use the steps in the approach to solving a problem.</p>
<h2 id="what-else-do-you-need-to-know">What else do you need to know?</h2>
<p>Before you continue, getting a basic understanding of the following topics is recommended.</p>
<h3 id="asymptotic-notation">Asymptotic notation</h3>
<p>When working with algorithms, efficiency is everything.
If you’ve watched the video, you’ve heard about “time complexity analysis” and something called O(n<sup>2</sup>). That way of writing is known as “Big O notation”, a form of “asymptotic notation”. In a nutshell, asymptotic notation is a way to describe how your algorithm’s performance changes as the amount of data it processes grows. Check out the resources below to learn more:</p>
<ul>
<li>
<p><a href="http://blog.honeybadger.io/a-rubyist-s-guide-to-big-o-notation/">A Rubyist’s guide to Big O notation</a></p>
</li>
<li>
<p><a href="https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/asymptotic-notation">Asymptotic notation on Khan Academy</a></p>
</li>
</ul>
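<p>To get a feel for what these growth rates mean, here is a small sketch that simply counts steps. The method names are made up for this illustration: one loop over the data grows linearly, O(n), while a loop inside a loop (as in a naive sort that compares every pair) grows quadratically, O(n<sup>2</sup>).</p>

```ruby
# Illustration only: count the steps of a single pass over n items, O(n).
def count_linear_steps(n)
  steps = 0
  n.times { steps += 1 }
  steps
end

# Illustration only: count the steps of a loop nested in a loop, O(n^2).
def count_quadratic_steps(n)
  steps = 0
  n.times { n.times { steps += 1 } }
  steps
end

[10, 100, 1000].each do |n|
  puts "n=#{n}: linear=#{count_linear_steps(n)}, quadratic=#{count_quadratic_steps(n)}"
end
```

<p>Making the input ten times larger makes the linear count ten times larger, but the quadratic count a hundred times larger.</p>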
<h3 id="data-structures">Data structures</h3>
<p>The concept of data structures is essential not just for algorithms but for software development in general. Data structures are objects that organize data and offer operations to retrieve or manipulate the data stored within them. We’ve used a concrete implementation of a data structure (arrays) in our algorithm examples.</p>
<p>Data structures are a large topic, but for now, what you need to know is that the type of data structure and the way you retrieve and/or manipulate its data directly affect your algorithm’s performance. It’s recommended to get a basic understanding of the common data structures, their performance in certain situations and the differences between them.</p>
<p>Examples of common data structures are stacks, queues, linked lists, trees and graphs. Mycodeschool has a great series on data structures that you can watch here:</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=92S4zgXN17o&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P">Introduction to data structures</a></li>
</ul>
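<p>As a minimal sketch of two of these structures, a stack and a queue can both be built on top of a Ruby array. The class names <code class="highlighter-rouge">SimpleStack</code> and <code class="highlighter-rouge">SimpleQueue</code> are invented for this example; the only difference between them is which end of the array an item is removed from.</p>

```ruby
# Illustration only: a stack is last in, first out (LIFO).
class SimpleStack
  def initialize
    @items = []
  end

  def push(item)
    @items.push(item)
    self
  end

  # Removes and returns the most recently added item.
  def pop
    @items.pop
  end
end

# Illustration only: a queue is first in, first out (FIFO).
class SimpleQueue
  def initialize
    @items = []
  end

  def enqueue(item)
    @items.push(item)
    self
  end

  # Removes and returns the oldest item.
  def dequeue
    @items.shift
  end
end

stack = SimpleStack.new.push(1).push(2).push(3)
queue = SimpleQueue.new.enqueue(1).enqueue(2).enqueue(3)
puts "stack.pop     => #{stack.pop}"
puts "queue.dequeue => #{queue.dequeue}"
```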
<p><br /></p>
<p>That’s it for this very small introduction to algorithms! Hopefully, this provides a starting point for further exploration and practice.
If you have any questions or feedback, let me know in the comments.</p>
<h3 id="sources">Sources</h3>
<ul>
<li>
<p><a href="https://www.youtube.com/channel/UClEEsT7DkdVO_fkrBw0OTrA">Mycodeschool Youtube channel</a></p>
</li>
<li>
<p>Examples of common algorithms implemented in Ruby: <a href="https://github.com/kanwei/algorithms">Google Summer of Code, Ruby Algorithms</a></p>
</li>
<li>
<p><a href="https://www.khanacademy.org/computing/computer-science/algorithms">Khan Academy course on algorithms</a></p>
</li>
<li>
<p><a href="https://github.com/Developer-Y/cs-video-courses/blob/master/README.md">A huge list of CS resources, useful if you know what you’re looking for</a></p>
</li>
</ul>
Speech to text with the Google cloud speech api and Ruby
2016-11-16T10:10:01+01:00
http://localhost:4000/2016/11/16/speech-to-text-with-the-google-cloud-speech-api-and-ruby
<p>This guide is a quick overview of how to set up speech-to-text conversion with the Google Cloud Speech API in your Ruby application. The <a href="https://cloud.google.com/speech/">Google Cloud Speech API</a> provides speech recognition for over 80 languages, powered by machine learning.</p>
<p>First, let’s head over to the <a href="https://cloud.google.com/">Google cloud platform</a> to create an account. After creating your account, create your first project. To use the Cloud Speech API, activate the API from your API management dashboard.</p>
<p>Finally, after activating the API, go to your API management dashboard and select credentials. Here you can create an API key that we will use for authentication.</p>
<h2 id="setting-up-the-google-api-client">Setting up the Google api client</h2>
<p>In this example, we will use a basic ruby project/gem. Feel free to do this in Rails or any other framework you prefer.</p>
<p><code class="highlighter-rouge">bundle gem my_project</code></p>
<p>To access the Google cloud speech api, we will use the <a href="https://github.com/google/google-api-ruby-client">Google api ruby client</a>.</p>
<p>Add the gem to your Gemfile:</p>
<p><code class="highlighter-rouge">gem 'google-api-client', '~> 0.9'</code></p>
<p>The API client we want to use is <a href="https://github.com/google/google-api-ruby-client/tree/master/generated/google/apis/speech_v1beta1">speech_v1beta1</a>. In <a href="https://github.com/google/google-api-ruby-client/blob/master/generated/google/apis/speech_v1beta1/service.rb">service.rb</a>, a few methods are provided that allow us to work with the speech recognition API. To perform the audio processing, the Ruby API client gives us two options:</p>
<ul>
<li>synchronous, with the <code class="highlighter-rouge">sync_recognize_speech</code> method. The results are returned after all audio has been sent and processed.</li>
<li>asynchronous, with the <code class="highlighter-rouge">async_recognize_speech</code> method. This allows the audio to be processed asynchronously, returning an ‘operation’ with the operation status and/or results.
This is the method that we will use in our example.</li>
</ul>
<h2 id="processing-our-audio-file">Processing our audio file</h2>
<p>In the Transcriber module, add the following two methods:</p>
<ul>
<li><code class="highlighter-rouge">async_request</code>: performs the request, sending a request object with our audio file and configuration.</li>
<li><code class="highlighter-rouge">get_operation</code>: retrieves the operation with the status of our request, and the results if the operation is finished.</li>
</ul>
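<p>In the example, we use Google’s sample audio file from Cloud Storage (brooklyn.flac). If you want, you can use the content attribute instead (commented out in the code below) to send a local .flac file. Note that content should hold the audio data itself, so read the file rather than passing only its path.</p>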
<p><code class="highlighter-rouge">lib/transcriber.rb</code></p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'google/apis/speech_v1beta1'</span>
<span class="k">module</span> <span class="nn">Transcriber</span>
<span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">async_request</span>
<span class="n">speech</span> <span class="o">=</span> <span class="no">Google</span><span class="o">::</span><span class="no">Apis</span><span class="o">::</span><span class="no">SpeechV1beta1</span><span class="o">::</span><span class="no">SpeechService</span><span class="p">.</span><span class="nf">new</span>
<span class="n">speech</span><span class="p">.</span><span class="nf">key</span> <span class="o">=</span> <span class="s1">'YOUR_API_KEY'</span>
<span class="n">async_recognize_request_object</span> <span class="o">=</span> <span class="no">Google</span><span class="o">::</span><span class="no">Apis</span><span class="o">::</span><span class="no">SpeechV1beta1</span><span class="o">::</span><span class="no">AsyncRecognizeRequest</span><span class="p">.</span><span class="nf">new</span>
<span class="n">async_recognize_request_object</span><span class="p">.</span><span class="nf">config</span> <span class="o">=</span> <span class="p">{</span>
<span class="ss">encoding: </span><span class="s2">"FLAC"</span><span class="p">,</span>
<span class="ss">sample_rate: </span><span class="mi">16000</span><span class="p">,</span>
<span class="ss">language_code: </span><span class="s2">"en-US"</span>
<span class="p">}</span>
<span class="n">async_recognize_request_object</span><span class="p">.</span><span class="nf">audio</span> <span class="o">=</span> <span class="p">{</span>
<span class="c1"># content: path_to_audio</span>
<span class="ss">uri: </span><span class="s1">'gs://cloud-samples-tests/speech/brooklyn.flac'</span>
<span class="p">}</span>
<span class="n">speech</span><span class="p">.</span><span class="nf">async_recognize_speech</span><span class="p">(</span><span class="n">async_recognize_request_object</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">get_operation</span><span class="p">(</span><span class="n">operation_name</span><span class="p">)</span>
<span class="n">speech</span> <span class="o">=</span> <span class="no">Google</span><span class="o">::</span><span class="no">Apis</span><span class="o">::</span><span class="no">SpeechV1beta1</span><span class="o">::</span><span class="no">SpeechService</span><span class="p">.</span><span class="nf">new</span>
<span class="n">speech</span><span class="p">.</span><span class="nf">key</span> <span class="o">=</span> <span class="s1">'YOUR_API_KEY'</span>
<span class="n">speech</span><span class="p">.</span><span class="nf">get_operation</span><span class="p">(</span><span class="n">operation_name</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<h2 id="using-the-module">Using the module</h2>
<p>Run irb in the lib directory, followed by <code class="highlighter-rouge">require_relative "transcriber"</code>
to load your module. Or use the console if you’re using Rails.</p>
<p>To make your first request:</p>
<p><code class="highlighter-rouge">request = Transcriber::async_request</code></p>
<p>This will respond with an ‘operation’ that has a <code class="highlighter-rouge">name</code> attribute. The name is what we will use to identify our operation and to retrieve the results.</p>
<p>Depending on the length of your file, it can take some time to process the audio. Use the get_operation method with the operation name to retrieve the operation status.</p>
<p><code class="highlighter-rouge">operation = Transcriber::get_operation(request.name)</code></p>
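<p>Because the operation may not be finished on the first try, you can wrap that call in a small polling loop. The sketch below is one way to do it, not something the gem provides: <code class="highlighter-rouge">wait_for_operation</code> is a hypothetical helper, and it assumes the returned operation exposes a <code class="highlighter-rouge">done</code> attribute, as Google’s long-running operations do. A stand-in struct replaces the real API so the example runs without credentials.</p>

```ruby
# Hypothetical helper, not part of the google-api-client gem: runs the given
# block until the operation it returns reports that it is done.
def wait_for_operation(max_attempts: 10, interval: 2)
  max_attempts.times do
    operation = yield
    return operation if operation.done
    sleep(interval)
  end
  nil # still not finished after max_attempts polls
end

# Stand-in for the real API so the sketch runs without credentials:
# the fake operation reports done on the third poll.
FakeOperation = Struct.new(:name, :done)
polls = 0
result = wait_for_operation(interval: 0) do
  polls += 1
  FakeOperation.new("operations/123", polls >= 3)
end
puts "finished after #{polls} polls" if result
```

<p>With the module from this post, the call would look like <code class="highlighter-rouge">operation = wait_for_operation { Transcriber.get_operation(request.name) }</code>.</p>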
<p>If the operation is finished, check out the transcript from your request with</p>
<p><code class="highlighter-rouge">operation.response["results"]</code></p>
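<p>The results are nested, so a small helper can pull out just the text. This is a sketch under an assumption about the response shape: it expects a list of results, each with a list of <code class="highlighter-rouge">"alternatives"</code> that carry a <code class="highlighter-rouge">"transcript"</code> and a <code class="highlighter-rouge">"confidence"</code>. The <code class="highlighter-rouge">best_transcripts</code> name and the sample data are made up for illustration, so check them against what your own request returns.</p>

```ruby
# Hypothetical helper: picks the highest-confidence transcript of each result.
# Assumes results look like:
#   [{ "alternatives" => [{ "transcript" => "...", "confidence" => 0.9 }] }]
def best_transcripts(results)
  Array(results).map do |result|
    alternatives = result.fetch("alternatives", [])
    best = alternatives.max_by { |alternative| alternative.fetch("confidence", 0.0) }
    best && best["transcript"]
  end.compact
end

# Made-up sample data in the assumed shape, not real API output.
sample_results = [
  { "alternatives" => [
    { "transcript" => "hello world", "confidence" => 0.92 },
    { "transcript" => "hello word", "confidence" => 0.41 }
  ] },
  { "alternatives" => [] }
]
puts best_transcripts(sample_results).join(" ")
```

<p>With a finished operation you would call it as <code class="highlighter-rouge">best_transcripts(operation.response["results"])</code>.</p>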
<p>For more information about the Google Cloud Speech API, see:</p>
<p><a href="https://cloud.google.com/speech/docs/getting-started">Cloud speech api - Getting started</a></p>