Skip to main content

Ruby: A Python Programmer's Perspective Part III

This is a somewhat random list of things that were interesting or surprising to me when I read Ruby for Rails. The previous post in the series is Ruby: A Python Programmer's Perspective Part II.

I've mentioned before that Ruby has symbols which are similar to, but distinct from, strings (which are mutable). The syntax is :foo. To create a symbol with spaces, use :"foo bar".

Ruby has two syntax for ranges. ".." includes both endpoints. "..." does not include the right endpoint and thus behaves like slices in Python do:
>> (0...2).each {|x| p x}
0
1
=> 0...2

>> (0..2).each {|x| p x}
0
1
2
=> 0..2
Since the "+" operator is just syntactic sugar for the "+" method, the following are equivalent:
>> 1 + 1
=> 2
>> 1.+(1)
=> 2
Whenever possible, I think interpreters should raise an exception if you misspell something. Hence, I found the following interesting:
>> def f
>> p @undefed_instance_variable # This prints nil.
>> p undefined_local # This raises a NameError.
>> end
=> nil
>> f
nil
NameError: undefined local variable or method `undefined_local' for main:Object
from (irb):39:in `f'
from (irb):41
nil and false are the only two objects that evaluate to false:
>> if []: puts "Still true"; end
Still true
=> nil
In Python, "truthiness" is a lot more complex. For instance, [1] is true, whereas [] is false, and in general, classes can define their own notion of truthiness.

Ruby has "==", "===", ".eql", and ".equal". "===" is for case statements. "==" is related to ">=", etc. ".eql" tests that the two objects have the "same content". Usually, "==" and ".eql" return the same thing, but their implementation is different. ".equal" is reserved for object identity. Hence:
>> "a" == "a"
=> true
>> "a".eql?("a")
=> true
>> "a".equal?("a")
=> false
>> "a" === "a"
=> true
If you want to see all of a class's instance methods, use the "instance_methods" method. If you don't want to see inherited methods, pass false to "instance_methods":
>> class C
>> def f
>> end
>> end
=> nil
>>
?> C.instance_methods
=> ["inspect", "f", "clone", "method", "public_methods",
"instance_variable_defined?", "equal?", "freeze", "methods", "respond_to?",
"dup", "instance_variables", "__id__", "object_id", "eql?", "require", "id",
"singleton_methods", "send", "taint", "frozen?", "instance_variable_get",
"__send__", "instance_of?", "to_a", "type", "protected_methods",
"instance_eval", "==", "display", "===", "instance_variable_set", "kind_of?",
"extend", "to_s", "hash", "class", "tainted?", "=~", "private_methods", "gem",
"nil?", "untaint", "is_a?"]
>>
?> C.instance_methods(false)
=> ["f"]
Using Class.instance_methods.sort is helpful too.

When you are appending one string onto the end of another string, using += does not modify the original string, whereas << does:
>> s = "foo"
=> "foo"
>> t = s
=> "foo"
>> s += "bar"
=> "foobar"
>> s
=> "foobar"
>> t
=> "foo"
>> u = t
=> "foo"
>> u << "bar"
=> "foobar"
>> u
=> "foobar"
>> t
=> "foobar"
Indexing a string returns that character's ASCII value, which is a bit surprising. Slicing does what you would expect:
=> "abc"
>> s[1]
=> 98
>> s[1,1]
=> "b"
However, indexing a string to set a character works as expected:
>> s[1] = 'B'
=> "B"
>> s
=> "aBc"
Arrays have a "concat" method. "concat" and "+" operate as you might expect. "concat" is conceptually "concat!" because it modifies the array, but it's spelled without the "!":
>> s = [1, 2]
=> [1, 2]
>> t = [3, 4]
=> [3, 4]
>> s + t
=> [1, 2, 3, 4]
>> s
=> [1, 2]
>> s.concat(t)
=> [1, 2, 3, 4]
>> s
=> [1, 2, 3, 4]
>> t
=> [3, 4]
To get both the values and indexes of an array as you loop over it, use the each_with_index method:
>> ['a', 'b', 'c'].each_with_index {|x, i| puts "#{i} => #{x}"}
0 => a
1 => b
2 => c
There are a couple ways to create a hash. The second syntax is a bit weird to my eyes:
>> {:a => :b}
=> {:a=>:b}

>> Hash[:a => :b]
=> {:a=>:b}
By default, you'll get nil if you try to lookup a key that doesn't exist in a hash. As I've mentioned before, I prefer to get exceptions by default if I misspell things. Fortunately, the fetch method exists:
>> h = {:a => :b}
=> {:a=>:b}
>> h[:c]
=> nil
>> h.fetch(:c)
IndexError: key not found
from (irb):50:in `fetch'
from (irb):50
The default nil behavior of Ruby hashes is particular frustrating to me since Ruby uses hashes for keyword arguments. It's really easy for a misspelled option to lead to a silent bug:
>> def f(arg, options)
>> p arg
>> if options[:excited]
>> puts "Oh, yeah!" # This doesn't get printed.
>> end
>> end
=> nil
>>
?> f("hello", :excitid => true) # Oops, typo.
"hello"
=> nil
I think real support for keyword arguments is better. Here's the Python:
>>> def f(arg, excited=False):
... print arg
... if excited:
... print "Yeah!"
...
>>> f("hello", excitid=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: f() got an unexpected keyword argument 'excitid'
You can still use the Ruby approach in Python using the "**kargs" syntax, but it's not the default.

I was a little surprised that you could connect two statements with a semicolon inside parenthesis:
>> (puts 'hi'; puts 'there') unless 100 < 10
hi
there
=> nil
Iterating over a string yields lines, not characters like it does in Python:
>> "foo\nbar".each {|line| puts line}
foo
bar
=> "foo\nbar"
When using regular expressions, the =~ operator returns the index of the match, whereas the #match method returns a MatchData object:
>> /o/ =~ 'foo'
=> 1
>> /o/.match('foo')
=> #<MatchData:0x35d5cc>
Be careful. The MatchData object stores captures starting at 1, whereas the #captures method stores captures starting at 0:
>> match = /(o)/.match('foo')
=> #<MatchData:0x357654>
>> match[1]
=> "o"
>> match.captures[0]
=> "o"
See my earlier post concerning "^" vs. "\A" in regular expressions. Ruby and Python are subtly different. Whereas I would normally use "^" in Python, to get the same behavior, I would use "\A" in Ruby.

If you want to treat a string as an array of characters, use '"foobar".split(//)'.

String#scan is a pretty cool method that repeatedly tries to match a regular expression while iterating over a string. Unlike #match, it returns all of the matches, instead of just the first:
>> s = "John Doe was the father of Jane Doe."
=> "John Doe was the father of Jane Doe."
>> s.scan(/([A-Z]\w+)\s+([A-Z]\w+)/)
=> [["John", "Doe"], ["Jane", "Doe"]]
Okay, that's it for this post, but I've got more coming :)

Comments

Philip Jenvey said…
Note that Python's re.findall is analogous to String#scan
jjinux said…
Phil, nice! I didn't know about that.
jjinux said…
The next post in the series is here:

http://jjinux.blogspot.com/2009/02/ruby-python-programmers-perspective_9838.html

Popular posts from this blog

Ubuntu 20.04 on a 2015 15" MacBook Pro

I decided to give Ubuntu 20.04 a try on my 2015 15" MacBook Pro. I didn't actually install it; I just live booted from a USB thumb drive which was enough to try out everything I wanted. In summary, it's not perfect, and issues with my camera would prevent me from switching, but given the right hardware, I think it's a really viable option. The first thing I wanted to try was what would happen if I plugged in a non-HiDPI screen given that my laptop has a HiDPI screen. Without sub-pixel scaling, whatever scale rate I picked for one screen would apply to the other. However, once I turned on sub-pixel scaling, I was able to pick different scale rates for the internal and external displays. That looked ok. I tried plugging in and unplugging multiple times, and it didn't crash. I doubt it'd work with my Thunderbolt display at work, but it worked fine for my HDMI displays at home. I even plugged it into my TV, and it stuck to the 100% scaling I picked for the othe

ERNOS: Erlang Networked Operating System

I've been reading Dreaming in Code lately, and I really like it. If you're not a dreamer, you may safely skip the rest of this post ;) In Chapter 10, "Engineers and Artists", Alan Kay, John Backus, and Jaron Lanier really got me thinking. I've also been thinking a lot about Minix 3 , Erlang , and the original Lisp machine . The ideas are beginning to synthesize into something cohesive--more than just the sum of their parts. Now, I'm sure that many of these ideas have already been envisioned within Tunes.org , LLVM , Microsoft's Singularity project, or in some other place that I haven't managed to discover or fully read, but I'm going to blog them anyway. Rather than wax philosophical, let me just dump out some ideas: Start with Minix 3. It's a new microkernel, and it's meant for real use, unlike the original Minix. "This new OS is extremely small, with the part that runs in kernel mode under 4000 lines of executable code.&quo

Haskell or Erlang?

I've coded in both Erlang and Haskell. Erlang is practical, efficient, and useful. It's got a wonderful niche in the distributed world, and it has some real success stories such as CouchDB and jabber.org. Haskell is elegant and beautiful. It's been successful in various programming language competitions. I have some experience in both, but I'm thinking it's time to really commit to learning one of them on a professional level. They both have good books out now, and it's probably time I read one of those books cover to cover. My question is which? Back in 2000, Perl had established a real niche for systems administration, CGI, and text processing. The syntax wasn't exactly beautiful (unless you're into that sort of thing), but it was popular and mature. Python hadn't really become popular, nor did it really have a strong niche (at least as far as I could see). I went with Python because of its elegance, but since then, I've coded both p