Posts Tagged ‘ruby’

Ruby on Rails Diff Text to HTML <ins> and <del>

Tuesday, November 4th, 2008

This code is handy if you have two text objects in your Rails application and you want to show their differences in one of your HTML views. It’s 99% pure Ruby too: only the temporary_directory line, which uses Rails.root, depends on Rails, so if you alter that line you can use the code for other purposes.

Only one thing to note: you must have diff installed. I’m using: diff (GNU diffutils) 2.8.1.

  # set up some variables to reference later
  temporary_directory = File.join(Rails.root, "tmp")
  max_lines = 9999999 # needs to be larger than the most lines you'll consider
  diff_header_length = 3 # the "---", "+++" and "@@" lines of a unified diff

  # text_old and text_new should be the values of the string objects to compare
  # these are just example strings to show it works
  text_old      = "line1\ndeleted line2\nline3\n\nline4\nline5"
  text_new      = "line1\ninserted line2\nline3\n\nline4\nline5"

  # since we're using diff on the file system, we'll save the text we want to compare
  # and then run diff against the two files
  file_old_name = File.join(temporary_directory, "file_old" + rand(1000000).to_s)
  file_new_name = File.join(temporary_directory, "file_new" + rand(1000000).to_s)
  File.open(file_old_name, "w") { |f| f.write(text_old + "\n") }
  File.open(file_new_name, "w") { |f| f.write(text_new + "\n") }

  # diff will provide a string showing insertions and deletions. We will
  # split this string by newlines if there are differences, and mark it up
  # accordingly with HTML
  lines = %x(diff --unified=#{max_lines} #{file_old_name} #{file_new_name})
  if lines.empty?
    # no output from diff means the two texts are identical
    lines = text_new.split(/\n/)
  else
    lines = lines.split(/\n/)[diff_header_length..-1].collect do |i|
      if i.empty?
        ""
      else
        # the first character of each diff line marks it as added, removed or unchanged
        case i[0, 1]
        when "+" then "<ins>" + i[1..-1] + "</ins>"
        when "-" then "<del>" + i[1..-1] + "</del>"
        else i[1..-1]
        end
      end
    end
  end

  # clean up the temporary diff files we created
  File.delete(file_new_name)
  File.delete(file_old_name)

  # return marked up text
  lines.join("\n")

If you fire up RAILS_ROOT/script/console and paste that code in, it will return a nicely marked up string like this:

  line1
  <del>deleted line2</del>
  <ins>inserted line2</ins>
  line3

  line4
  line5

Use CSS to make your ins and del tags render however you like.
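
One thing the snippet above does not do is escape the text before wrapping it in tags, so any HTML already in your strings would end up in the view as-is. Here is a minimal sketch of just the markup step as a standalone method that escapes each line first. The method name markup_unified_diff and the sample diff text are made up for illustration, and the escaping is my addition, not part of the original code.

```ruby
require 'erb'

# Hypothetical helper (not from the original post): takes the text of a
# unified diff and returns HTML-escaped lines wrapped in <ins>/<del>.
def markup_unified_diff(diff_output, header_length = 3)
  # Skip the "---", "+++" and "@@" header lines
  body = diff_output.split(/\n/)[header_length..-1] || []
  body.map do |line|
    # Everything after the first character is the actual text
    text = ERB::Util.html_escape(line[1..-1].to_s)
    case line[0, 1]
    when "+" then "<ins>#{text}</ins>"
    when "-" then "<del>#{text}</del>"
    else text
    end
  end.join("\n")
end

# Made-up diff output, in the shape that diff --unified produces
sample = [
  "--- file_old",
  "+++ file_new",
  "@@ -1,3 +1,3 @@",
  " line1",
  "-deleted <b>line2</b>",
  "+inserted line2",
  " line3"
].join("\n")

puts markup_unified_diff(sample)
```

Because the method only works on a string, you can try it in plain irb without Rails or diff installed.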

Ruby Script to Search Apache Logs for High Frequency Clients

Wednesday, October 29th, 2008

I wrote a quick Ruby script to scour through my Apache access logs and look for IPs that are hitting my site too frequently, e.g., bad bots, etc. The command line arguments are simple:

$ ruby find-frequent-clients.rb \
--apache-access-log=/path/to/your/log \
--seconds=3600 \
--request-limit=7200 \
--log-time-zone=PST

That command finds any client IPs that have hit my web server more than 7200 times in the last 3600 seconds (one hour), i.e., more than twice per second on average. The output is a newline-separated list of IP addresses (optionally with a hit count if --show-count=1 is added). Here’s how it works:

File: find-frequent-clients.rb
  require 'date'
  require 'time'

  # Process command line arguments. Keep only args starting with --
  args = {}
  ARGV.each do |arg|
    key, value = arg.split("=", 2)
    if key && key[0..1] == "--"
      # e.g. "--request-limit=7200" becomes args[:request_limit] = "7200"
      args[key[2..-1].gsub("-", "_").to_sym] = value
    end
  end

  # Check that we have the bare essentials to proceed
  raise "You must specify the full path to an Apache access log file with --apache-access-log" unless args[:apache_access_log]
  raise "You must specify the maximum amount of recent seconds to consider with --seconds" unless args[:seconds]
  raise "You must specify the maximum requests allowed per #{args[:seconds]} seconds with --request-limit" unless args[:request_limit]
  raise "You must specify the time zone of the Apache logs with --log-time-zone e.g., EST" unless args[:log_time_zone]
  raise "The Apache access log file specified does not exist or is not readable: #{args[:apache_access_log]}" unless File.readable?(args[:apache_access_log])

  # Read the file and walk the lines in reverse; stop once time stamps are beyond our time threshold
  log_array = []
  log_snapshot = File.readlines(args[:apache_access_log])
  start_time = Time.now.to_i
  log_snapshot.reverse_each do |line|
    line_array = line.split(" ")
    date_time = line_array[3][1..-1] # drop the leading "["
    date_time[11] = " "              # replace the ":" between the date and the time
    date_time = Time.iso8601(DateTime.parse(date_time + " " + args[:log_time_zone]).to_s).to_i
    if date_time > (start_time - args[:seconds].to_i)
      log_array << [line_array[0], date_time]
    else
      break
    end
  end

  # Use a hash to collect the counts of the IPs
  log_hash = Hash.new(0)
  log_array.each do |log|
    log_hash[log[0]] += 1
  end

  # collect the offenders in an array of [ip, count] pairs
  offenders = log_hash.to_a.select { |ip, count| count > args[:request_limit].to_i }

  # output the offending IPs, 1 per line; optionally show the offending count
  offenders.each do |ip, count|
    puts args[:show_count] ? "#{ip} => #{count}" : ip
  end

Note: This makes the assumption that your logs are in the format: aa.bb.cc.dd - - [datetime]
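
To make that format assumption concrete, here is a minimal sketch of pulling the IP and time stamp out of a single log line. It is an alternative to the parsing in the script, not what the script does: it reads the UTC offset that Apache writes inside the brackets, so it would not need --log-time-zone. The sample line and the method name parse_log_line are made up for illustration.

```ruby
require 'time'

# Made-up sample line in the "aa.bb.cc.dd - - [datetime]" format
line = '192.0.2.10 - - [04/Nov/2008:10:15:30 -0800] "GET / HTTP/1.1" 200 512'

# Hypothetical helper: returns [ip, Time], or nil if the line
# does not match the assumed format.
def parse_log_line(line)
  match = line.match(/\A(\S+) \S+ \S+ \[([^\]]+)\]/)
  return nil unless match
  # %z reads the "-0800" offset that is part of the log entry
  [match[1], Time.strptime(match[2], "%d/%b/%Y:%H:%M:%S %z")]
end

ip, time = parse_log_line(line)
puts ip
puts time.utc.iso8601
```

Returning nil for lines that do not match means a stray malformed line can be skipped rather than raising halfway through the log.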

Skip All Rails Filters

Tuesday, October 14th, 2008

It took me a while to figure this out, but it’s quite simple.  If you want to skip all of the filters a Rails controller will run, simply put the following at the top of your controller:

  skip_filter filter_chain # both documented in the Rails API

For example, if your application controller defines a filter to check if a user is logged in, it makes sense that this filter might run for all controllers, except in rare cases.  In my case, I have a dynamic image controller that doesn’t require all of the overhead that most controllers do.  For that controller, I use the above to skip all of the filters.

Ruby On Rails RSS Reader

Saturday, July 12th, 2008

We moved our Athlo blog to a WordPress app to separate it completely from the main app. The one interaction I still wanted between the two was for the most recent blog entries to show on the Athlo site. I thought that RSS would offer an easy solution, so I started looking around to find out if I’d need a Rails plugin or something like that.

The solution was far simpler. And pure Ruby (man I love this language!).

  require 'rss'
  require 'open-uri' # lets open() fetch URLs; on Ruby 2.5+ call URI.open instead
  rss = RSS::Parser.parse(open('http://blog.athlo.com/feed/').read, false).items[0..MaxRSSItems-1]

That’s it. That simple call gives you an array of the items from the RSS feed. In my specific example, I’ve used a range to limit the results to the first MaxRSSItems items, where MaxRSSItems is a constant you define yourself.
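
If you want to try the parsing without hitting the network, here is a minimal sketch that feeds RSS::Parser a feed held in a string. The feed contents and the max_items variable are made up for illustration. Note that on Ruby 3.0 and later, rss ships as a bundled gem rather than as part of the standard library, so it may need to be installed separately.

```ruby
require 'rss'

# Made-up RSS 2.0 feed standing in for a real blog feed
feed_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example blog</title>
      <link>http://example.com/</link>
      <description>Made-up feed for illustration</description>
      <item><title>First post</title><link>http://example.com/1</link></item>
      <item><title>Second post</title><link>http://example.com/2</link></item>
      <item><title>Third post</title><link>http://example.com/3</link></item>
    </channel>
  </rss>
XML

max_items = 2 # plays the role of MaxRSSItems

# The second argument (false) turns off validation, as in the one-liner
items = RSS::Parser.parse(feed_xml, false).items.first(max_items)
items.each { |item| puts "#{item.title} (#{item.link})" }
```

Each item responds to title and link, which is usually all a "latest posts" list in a view needs.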

No plugins required. No Rails required. Ruby RSS will do what you need to read feeds. (That should be in a poem.)