Today I was asked to convert an XML file full of e-mail addresses into JSON. I thought this solution might help others learn how to create an XML file, parse it, and convert it to JSON.
This tutorial requires you to create four files:
- my_app/Gemfile for managing third-party libraries.
- my_app/email_builder.rb will generate XML data using the faker library.
- my_app/email_parser.rb will parse the XML data and convert it to JSON.
- my_app/lib/file_manager.rb for creating, loading and saving data to files.
If you prefer to download an example application, it's available on GitHub.
Step 0 - Getting Started
Create two folders:
- Project folder named my_app.
- Lib folder my_app/lib.
cd ~/Desktop
mkdir my_app
mkdir my_app/lib
Step 1 - Create Gemfile
Create my_app/Gemfile and paste this.
# frozen_string_literal: true
source "https://rubygems.org"
git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
gem 'faker', :git => 'https://github.com/faker-ruby/faker.git', :branch => 'main'
gem "populator", '~> 1.0.0'
gem "excelsior", '~> 0.1.0'
gem "nokogiri", '~> 1.8.0'
gem "activesupport"
After pasting the libraries, install them.
bundle install
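If you want to confirm that the libraries can actually be loaded before moving on, a quick check along these lines may help. This is only a sketch: `library_available?` is a hypothetical helper name, not part of Bundler, and the list of names is simply the libraries the scripts in this tutorial require.

```ruby
# Hypothetical helper: returns true when `require name` succeeds.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# Libraries required by email_builder.rb and email_parser.rb
%w[faker nokogiri rexml/document active_support].each do |name|
  status = library_available?(name) ? "ok" : "missing"
  puts "#{name}: #{status}"
end
```

Anything reported as missing usually means `bundle install` did not finish, or the script is running under a different Ruby than the one the gems were installed for.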
Step 2 - Create email_builder.rb
Create my_app/email_builder.rb and paste this.
#!/usr/bin/env ruby
require 'date'
require 'faker'
require 'nokogiri'
require File.join(File.dirname(__FILE__), 'lib', 'file_manager')

class EmailUtil
  include FileManager

  NUM_OF_EMAILS = 5000
  FILE_PATH = "content-#{Date.today}.xml"

  def initialize
    xml = create_xml
    save(xml, FILE_PATH)
  end

  # http://stackoverflow.com/a/27065613
  def create_xml
    builder = Nokogiri::XML::Builder.new(encoding: 'UTF-8') do |xml|
      xml.data {
        NUM_OF_EMAILS.times do
          first_name = Faker::Name.first_name
          last_name = Faker::Name.last_name
          xml.option(
            # Text node: the email followed by an optional full name in parentheses
            %{#{Faker::Internet.email} (#{['', first_name + " " + last_name].sample})},
            first_name: ['', first_name].sample,
            last_name: ['', last_name].sample,
            zip_code: ['', Faker::Address.zip_code].sample,
            gender: ["m", "f", "o"].sample,
            dob: Faker::Date.between(from: Date.parse("1st Jan 1920"), to: Date.parse("1st Jan #{min_age_requirement}")),
            phone_mobile: ['', Faker::PhoneNumber.cell_phone].sample,
            phone_other: ['', Faker::PhoneNumber.phone_number].sample
          )
        end
      }
    end
    builder.to_xml
  end

  private

  # Latest birth year allowed, so that every generated contact is at least 13
  def min_age_requirement
    this_year = Time.now.year
    min_age = 13
    this_year - min_age
  end
end

EmailUtil.new
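Two details of create_xml are easy to miss: the text of each option is either "email (Full Name)" or "email ()", and the latest date of birth is 1 January of the year thirteen years back. The sketch below isolates both using only the standard library. The helper names `option_text` and `latest_birth_date` are made up for illustration and do not appear in the script above.

```ruby
require 'date'

# Hypothetical helper mirroring the text node built in create_xml:
# "email (Full Name)" when a name is picked, "email ()" when it is blank.
def option_text(email, first_name, last_name)
  full_name = ['', "#{first_name} #{last_name}"].sample
  "#{email} (#{full_name})"
end

# Hypothetical helper mirroring min_age_requirement:
# 1 January of the year that is `min_age` years before `today`.
def latest_birth_date(today = Date.today, min_age = 13)
  Date.new(today.year - min_age, 1, 1)
end

3.times { puts option_text("jane@example.com", "Jane", "Doe") }
puts latest_birth_date
```

Because the name is chosen with `sample`, roughly half of the generated records end up with empty parentheses, which is why the parser has to cope with a blank name.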
Step 3 - Create email_parser.rb
Create my_app/email_parser.rb and paste this.
#!/usr/bin/env ruby
require 'date'
require 'json'
require 'active_support'
require 'active_support/json'
require 'rexml/document'
require File.join(File.dirname(__FILE__), 'lib', 'file_manager')

class List
  include FileManager

  FILE_XML = "content-#{Date.today}.xml"
  FILE_JSON = "content-#{Date.today}.json"

  # Reads the XML path from the first command-line argument,
  # falling back to the file generated today by email_builder.rb.
  def initialize(input = ARGV.first || FILE_XML)
    unless File.exist?(input)
      puts "Could not find #{input}. Please add an XML filepath"
      puts "For example: ruby email_parser.rb './path/to/file.xml'"
      exit 1
    end
    puts input
    file = load(input)
    json = parse(file)
    save(json, FILE_JSON)
  end

  def parse(file)
    # Create a new Rolodex
    contacts = []
    # Convert the file to become XML-ready
    doc = REXML::Document.new(file)
    # Iterate through each node
    doc.elements.each("data/option") do |e|
      # The text looks like "email (Full Name)"; split it at the first "("
      email, rest = e.text.to_s.split("(", 2)
      # Capture the name before ")"; it may be blank
      name = rest.to_s[/[^)]+/].to_s.strip
      contacts.push({
        "email": email.to_s.strip,
        "full_name": name,
        "first_name": e.attributes["first_name"],
        "last_name": e.attributes["last_name"],
        "zip_code": e.attributes["zip_code"],
        "gender": e.attributes["gender"],
        "dob": e.attributes["dob"],
        "phone_mobile": e.attributes["phone_mobile"],
        "phone_other": e.attributes["phone_other"]
      })
    end
    # https://www.rubydoc.info/docs/rails/4.1.7/ActiveSupport/JSON/Encoding#json_encoder-class_method
    ActiveSupport::JSON.encode(contacts)
  end
end

List.new
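To see the parsing step on its own, here is a small sketch that runs against an inline sample instead of a file. The sample XML, the trimmed set of fields, and the helper names `split_contact` and `contacts_from` are assumptions made for illustration. It also swaps ActiveSupport for the standard library's `JSON.pretty_generate` so it runs without extra gems.

```ruby
require 'json'
require 'rexml/document'

# Hypothetical sample in the same shape that email_builder.rb writes.
SAMPLE_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <data>
    <option first_name="Jane" last_name="" gender="f">jane@example.com (Jane Doe)</option>
    <option first_name="" last_name="" gender="o">sam@example.com ()</option>
  </data>
XML

# Split "email (Full Name)" into its two parts; the name may be blank.
def split_contact(text)
  email, rest = text.to_s.split("(", 2)
  name = rest.to_s[/[^)]+/].to_s.strip
  [email.to_s.strip, name]
end

# Build one hash per <option> element under <data>.
def contacts_from(xml)
  doc = REXML::Document.new(xml)
  doc.elements.to_a("data/option").map do |element|
    email, full_name = split_contact(element.text)
    {
      email: email,
      full_name: full_name,
      first_name: element.attributes["first_name"],
      gender: element.attributes["gender"]
    }
  end
end

puts JSON.pretty_generate(contacts_from(SAMPLE_XML))
```

Splitting with a limit of 2 keeps everything after the first "(" together, and the `to_s` calls mean a record with no parentheses at all yields an empty name rather than an error.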
Step 4 - Create file_manager.rb module
Create my_app/lib/file_manager.rb and paste this.
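#!/usr/bin/env ruby
require 'fileutils'

module FileManager
  APP_ROOT = File.dirname(__FILE__)
  OUTPUT_DIR = "output"

  def destroy_dir
    puts "destroy_dir"
    FileUtils.rm_rf(OUTPUT_DIR)
  end

  def create_dir
    # mkdir_p does not raise if the folder already exists
    FileUtils.mkdir_p(OUTPUT_DIR)
    # Build the path with File.join to keep it platform independent
    $:.unshift(File.join(APP_ROOT, OUTPUT_DIR))
  end

  # Returns the path of a file inside OUTPUT_DIR; it does not create the file
  def create_file(file_path)
    File.join(OUTPUT_DIR, file_path)
  end

  # Note: in classes that include this module, this shadows Kernel#load
  def load(file)
    File.read(file)
  end

  def save(data, path)
    # Open the file, write the data, and close it when the block ends
    File.open(path, "w") { |output| output.puts data }
  end
end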
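Both scripts get their save and load methods by including this module. The sketch below shows that pattern in isolation with a save and load round trip inside a temporary folder. `TinyFileManager`, `Notebook`, and `round_trip` are hypothetical names used only here, and the module is a reduced stand-in rather than the real FileManager.

```ruby
require 'tmpdir'

# Hypothetical stand-in for FileManager, reduced to save and load.
module TinyFileManager
  def save(data, path)
    # The block form closes the file as soon as the write finishes
    File.open(path, "w") { |output| output.puts data }
  end

  def load(path)
    File.read(path)
  end
end

# Any class that includes the module gains both methods.
class Notebook
  include TinyFileManager
end

# Write text to a temporary file and read it straight back.
def round_trip(text)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "note.txt")
    notebook = Notebook.new
    notebook.save(text, path)
    notebook.load(path)
  end
end

puts round_trip("hello")
```

Note that `puts` appends a newline when it writes, so the text read back ends with "\n" even though the original string did not.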
Step 5 - Let's Build!
Create an XML file using the faker library.
ruby email_builder.rb
Parse the XML file and create a JSON file.
ruby email_parser.rb
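Once both scripts have run, you may want a quick sanity check on the result. The sketch below assumes the JSON file sits in the current folder under the name the parser uses. `summarize_contacts` is a hypothetical helper, not part of the scripts above.

```ruby
require 'date'
require 'json'

# Hypothetical helper: summarize a JSON array of contact hashes.
def summarize_contacts(json_text)
  contacts = JSON.parse(json_text)
  with_name = contacts.count { |contact| !contact["full_name"].to_s.empty? }
  { total: contacts.length, with_full_name: with_name }
end

# Same naming scheme as FILE_JSON in email_parser.rb
path = "content-#{Date.today}.json"
puts summarize_contacts(File.read(path)) if File.exist?(path)
```

With the builder's default settings the total should match NUM_OF_EMAILS, and roughly half of the contacts should carry a full name.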