Real Estate REAXML Parser Script: takes a minute to install and works on any website that needs a single import file, such as WordPress

My requirements when developing this REAXML parser script were that it be fast and efficient, that it not repeat imports when no changes are detected, and that it work with WP-Property to import the files and manage the property and rental listings.

So after countless hours searching for a REA XML parser to no avail, I set out to find a developer to build one.

The script had to do the following:

  • Combine the individual REAXML files into a single, or multiple import files.
  • Process the results and store them in three separate new REAXML files (current, sold/leased and withdrawn) to increase speed.
  • When the property status changes to sold/leased/withdrawn, delete the property entry from the current REAXML output file.
  • Just added: geocode the property address during import and store the result in an extra field, so my new property plugin does not have to geocode the entry again on import.

Problem #1 : The REAXML format is multiple XML files.

The problem with the REAXML import format is that the property entries are saved into separate files, each of which can contain one or more XML elements that are either new or updated.

REAXML File Structure Example:
commaus_2014-01-14_15-22-03_4940.xml
commaus_2014-01-14_21-22-01_4941.xml
commaus_2014-01-14_22-22-01_4942.xml
and they go on forever…
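As a rough illustration, the files can be put in chronological order before merging. This is only a sketch: it assumes the names always follow the prefix_date_time_sequence.xml pattern seen in the three examples above, and the names feed_sort_key and oldest_first are made up for this example rather than taken from the script.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from the example names: prefix_YYYY-MM-DD_HH-MM-SS_sequence.xml
NAME_RE = re.compile(
    r"^(?P<prefix>.+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})_(?P<seq>\d+)\.xml$"
)


def feed_sort_key(filename):
    """Return (timestamp, sequence number) for a feed file name."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected feed file name: {filename!r}")
    stamp = datetime.strptime(
        match["date"] + "_" + match["time"], "%Y-%m-%d_%H-%M-%S"
    )
    return (stamp, int(match["seq"]))


def oldest_first(filenames):
    """Sort feed files so that later files can overwrite earlier ones."""
    return sorted(filenames, key=feed_sort_key)
```

Processing the files oldest first means a later update to the same property naturally replaces an earlier one.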

Problem #2 : Importing one huge XML file will slow down your site

Processing a single, ever-growing file every day can slow the server down, especially on smaller shared servers, and the problem gets worse when your clients want the import schedule to run more frequently.

Problem #3 : Repeated entries in several files

I noticed that sometimes a record would come through several times with minor alterations each time, as if the agent couldn’t make up their mind about the details of a particular property entry. That’s normal human nature, like pressing send on an email and then remembering one more thing. The earlier records needed to be discarded so that only the latest one is parsed, again for speed.
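A minimal sketch of the "keep only the latest record" idea, written in Python purely for illustration. It assumes each listing element carries a modTime attribute and a uniqueID child, as in the REAXML sample quoted in the comments further down; the function names are hypothetical and this is not the script's actual code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def parse_mod_time(value):
    """Parse a modTime such as 2014-03-20-11:06 (seconds are optional)."""
    for fmt in ("%Y-%m-%d-%H:%M:%S", "%Y-%m-%d-%H:%M"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised modTime: {value!r}")


def latest_listings(xml_documents):
    """Map each uniqueID to the listing element with the newest modTime."""
    latest = {}
    for document in xml_documents:
        root = ET.fromstring(document)
        for listing in root:
            unique_id = listing.findtext("uniqueID")
            mod_time = listing.get("modTime")
            if unique_id is None or mod_time is None:
                continue  # not a listing we can compare, skip it
            stamp = parse_mod_time(mod_time)
            kept = latest.get(unique_id)
            if kept is None or stamp >= kept[0]:
                latest[unique_id] = (stamp, listing)
    return {uid: element for uid, (stamp, element) in latest.items()}
```

Only the surviving elements then need to be written out and imported, no matter how many times the agent re-sent the listing.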

Solution was to merge the entries into three separate files:

newlyAddedCurrent.xml
newlyAddedSold_Leased.xml
newlyAddedWithdrawn.xml
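The split itself can be sketched like this. The three file names come from the list above; everything else (the status-to-file mapping, the propertyList wrapper and the function name) is an assumption made for the example, and statuses outside the mapping are simply ignored here.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from a listing's status attribute to its output file.
STATUS_TO_FILE = {
    "current": "newlyAddedCurrent.xml",
    "sold": "newlyAddedSold_Leased.xml",
    "leased": "newlyAddedSold_Leased.xml",
    "withdrawn": "newlyAddedWithdrawn.xml",
}


def split_by_status(listings):
    """Group listing elements into one propertyList root per output file."""
    outputs = {
        name: ET.Element("propertyList")
        for name in set(STATUS_TO_FILE.values())
    }
    for listing in listings:
        status = (listing.get("status") or "").lower()
        target = STATUS_TO_FILE.get(status)
        if target is not None:
            outputs[target].append(listing)
    return outputs
```

Each root can then be saved with ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True). Because every listing ends up in exactly one file, a property that changes to sold drops out of the current file the next time the outputs are rebuilt.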

Why split the files into Current, sold/leased and withdrawn?

The Australian real estate market is unlike the US market: agents and companies here don’t have access to entire regions of property. They tend to have only a handful of listings per agent, so putting the “current” properties into their own file makes the process much faster because the file is much smaller.

When the first few files are imported it’s not much of a problem, but after a couple of years, having several hundred records processed several times per day could end up slowing or even killing your server. Your phone will be ringing with customer complaints. Not good.

With the files split, current.xml may be only 150 KB while sold.xml is over 1.5 MB. It is much quicker to process the small current.xml file repeatedly and sold.xml less frequently.

Why not import directly into the WordPress SQL database?

Importing directly into WordPress was a no-go because at the time WordPress was at version 2.6 and did not have the custom post type capabilities it has today. I also did not have my own property plugin, nor the expertise at that time to create a plugin like WP-Property, which has a nifty supermap feature that we ended up using to manage the listings on our WordPress theme.

Another reason is that I did not want WordPress to continually re-import data and images that hadn’t changed, as this would blow out the size of the website very quickly.

How did you handle the unique ID problem on import?

Each REA XML entry has a unique ID, but I didn’t want to re-import existing entries every time the import schedule ran if nothing had changed. That would be crazy.

What I did was use the unique ID combined with the modified date to determine whether the entry was new or changed. This significantly improved import speed: if, say, 10 records are in the current file and only 1 has changed, the importer skips 9 and imports only one entry.
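The check can be sketched in a few lines of Python. This is an illustration only: plan_import and the idea of keeping a dictionary of previously imported modified dates are assumptions for the example, not how the script or WP-Property stores its state.

```python
def plan_import(incoming, already_imported):
    """Decide which records need importing.

    incoming: iterable of (unique_id, mod_time) pairs from the current file.
    already_imported: dict mapping unique_id to the mod_time last imported.
    Returns (to_import, skipped) as two lists of unique IDs.
    """
    to_import, skipped = [], []
    for unique_id, mod_time in incoming:
        if already_imported.get(unique_id) == mod_time:
            skipped.append(unique_id)  # same ID, same modified date: no change
        else:
            to_import.append(unique_id)  # new ID or changed modified date
    return to_import, skipped
```

With 10 records in the file and one changed modified date, to_import holds a single ID and skipped holds the other nine.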

How stable is your script?

I’ve been using the script on several real estate websites over the past few years and only added the geocode function last year while developing our own property plugin. The script has worked 100%, with no need for monitoring once the cron job is set up.

Solution was to create a REA XML parser that was fast, efficient and did not import already existing property records if no changes were made

Having been so focused on building real estate websites for clients, I never thought of offering this easy-to-use script, which can be installed quickly and easily on your clients’ websites. Also included with the script are the import settings, so you can quickly set up WP-Property for your client or use WP All Import for a more custom setup.

Let me know in the comments below if this REA XML script is of interest to you so I can provide some documentation on its ease of use and setup.

Updated Post

How to Use FeedSync to process your REAXML feed

Click here to purchase and read more about what FeedSync can do for you

Comments

  1. Hi Merv,
    This sounds like the solution I’ve been looking for.
    I’m developing a website for a client, and have been having trouble with AgentPoint and Realestate.com.au integration with the site.
    Is this plugin available? Are you willing to share?
    How can I get my hands on it?
    Your help and expertise is greatly appreciated.
    Darren

  2. Hi Merv,
    Very interested in finding out more about how to install your plugin and your recent updates.
    Thanks for your help,
    Tim.

    • Thanks Tim, I’ve had a lot of interest in the REAXML FeedSync application and am wrapping the code into an easy-to-install package. I’ve sent you an email with some details.

    • Sure will Anthony, I’ve wrapped the application with a GUI with a quick help file on how it works, I’ll email the beta link for you to check it out and follow up this post with the details.

    • Just finishing up some details and adding a GUI and an option to disable geocoding, which is great for initial imports with over 2500 records, as that happens to be Google’s API daily limit for non-account holders.

  3. Hi Merv,

    This looks really interesting, would appreciate if you could email me some information on this.

    Thanks

    • I’m just putting the finishing touches on the app and setting up a demo page, and will let you know when it’s ready. It also now supports and REAXML elements. Others to follow.

  4. Since writing this post I’ve been putting together an application that is easy for you to set up and use on your server. Previously it just echoed basic output when a feed was successfully processed, but that is not the best way for others to set it up and use it easily.

    I’ve rewritten the software and wrapped it in a GUI using Bootstrap, so you can show your clients a great-looking application that is managing their feed and powering their property system.

    We’ve named the application FeedSync.

    Check out the FeedSync demo page at http://www.realestateconnected.com.au/feedsync/demo

    Upload your XML test files to:
    ftp.realestateconnected.com.au
    Login: feedsyncdemo@realestateconnected.com.au
    Password: feedsyncdemo

    Once you have uploaded the test files visit the following URL and press Process Feed
    http://demos.realestateconnected.com.au/feedsync/

    FeedSync will then merge your XML data into the outputs folder, which you can access from the demo link above.

    I recommend uploading a few test files. I will purge the input/output and processed folders when I can.

    * Geocoding has been disabled for the demo.

    Once you are happy with the software, jump over and purchase a license for $87 per end site; this is so I can provide you with support for the software.

    http://www.realestateconnected.com.au/feedsync/

  5. Hello Merv, I am a beginner developer, but I still agreed to develop a property-listing website on WordPress using an external XML file, and now, after a lot of research, I am a little bit desperate: I can’t get the parsing working. It is not a REAXML file but a French real estate listing, though it looks very similar. Here is the file http://clients.ac3-distribution.com/office8/courgeon_immobilier/cache/export.xml . Do you think it is possible for me to customize your plugin to adapt it to my situation?
    Thank you in advance!

    • Hi Maria,

      Thanks for asking. I too remember the feeling of dread the first time I built a real estate website that needed to import data, so let’s see if I can help.

      Yes, FeedSync can be adapted to suit your French XML file, as it reads specific XML elements which can be renamed; in your case each property element is <BIEN>. The question I have is: do you receive multiple XML files, or a single file containing all the properties?

      Single continually updated file
      If you only have a single file to deal with repeatedly then you won’t require FeedSync, as it is designed to merge, say, export_1.xml, export_2.xml, etc. into a single file that you can then import into WordPress. So if they are continually updating the export.xml file you can just use WP All Import to import the property data into WordPress. But I’m guessing this is not how it works.

      Multiple export files, e.g. export_1.xml, export_2.xml, export_3.xml
      FeedSync is designed to merge the multiple xml files into one file that you can then import. The way FeedSync works is it compares the elements included in the xml files. So yes it can be adapted to suit your xml files.

      REAXML Standard looks like this
      http://connectedrealestate.com.au/sites/feedSync/XML/commaus_2014-03-20_11-22-20_5139.xml

      Main XML container element
      <propertyList date="2014-03-20-11:22:04" username="commaust" password="commaust">

      Property Element
      <rental modTime="2014-03-20-11:06" status="current">
      or
      <residential modTime="2014-03-20-11:06" status="current"> and a few more

      Property Unique ID
      <uniqueID>737282</uniqueID>

      French XML file

      Main XML container element
      <LISTEPA>

      Property Element
      <BIEN>

      Property Unique ID (not 100% sure but is this the property unique ID?)
      <AFF_ID>10645524</AFF_ID>

      So if you are dealing with multiple files, the solution would be to rename the selectors inside the FeedSync processor file to match your XML file. Can you give me any more XML files from your French listing site? Your XML file has an element <VENDU>0</VENDU>, which my guess is means the property is not sold; if it was sold that would be <VENDU>1</VENDU>. FeedSync uses a selector called status="current" (not sold) and status="sold", so with a few modifications it can be adapted.

      Send me some more test files so we can check and see what we can do to help, and you can assist me in adding a FeedSync French version.

  6. Hello Merv, thank you very much for your reply and the information you give.

    Actually, for this project there is only one XML file, so I guess I will work with WP-All-Import. Anyway, I will contact you by e-mail with the details about the service I use so you could include it in your project.

    All the best!

  7. Hi Merv,

    I’ve tried to access the demo, but the FTP domain seems to be broken.

    Would you be able to email me with some login details?

    Thks!

    • Hi Daniel,

      Thanks for pointing that out. WordPress added http before the FTP details and I didn’t notice, which is what caused your login issue.

      Ftp Details are:
      ftp.realestateconnected.com.au
      Username: feedsyncdemo@realestateconnected.com.au
      Password: feedsyncdemo

  8. Hey Merv,

    I’m building a site currently for a real estate company.
    They have organised the xml side of things but how do I get the xml files from my ftp location?
    Then how are they sorted into for sale, for rent and commercial?
    And how does your application fit into this?

    • Hi John

      What we usually do is create a folder on the server for the XML files, where the feed provider adds them. FeedSync is designed to merge the files that they upload, sometimes 4 files per day.

      In order to import XML files you need to specify a single file, and this is what FeedSync does: it merges the files so you can import them.

      Inside the REAXML files each listing info is stored in an XML element eg

  9. Hi Merv

    A few years ago I built a real estate site (Theme Lighthouse) and recently Realestate.com.au have asked us to process their REA XML file.

    I am unsure how to do this then saw your posts.
    Is this website theme compatible for receiving their file?

    cheers leonie

    • Hi Leonie,

      Yes FeedSync processes the REAXML files so they can be imported into WordPress. We use WP All Import to import the processed REAXML files into WordPress.

      We’ve also developed a plugin, Easy Property Listings, that you can use to manage your listings. It is fully REAXML compatible.

      So what you will need:

      1. Install Easy Property Listings
      2. Install FeedSync to process the REAXML files.
      3. Install WP All Import.
      4. Grab our WP All Import Scripts that are already configured for Easy Property Listings.
      5. Configure Cron jobs on your cPanel for FeedSync and WP All Import to run a few times per day.

  10. Hi Merv,

    I have recently joined a very small real estate agency and they have practically no IT infrastructure in place. (I am an ex-developer.)

    They have a website that is managed in Joomla (written by a relative) and they manually load to it and to realestate.com.au.

    I am thinking of writing an app that will handle basic office admin and want to include a way to update multiple websites. I have discovered REAXML but have no idea where this data stream comes from. Is it created by each real estate office’s internal apps?

    Does FeedSync work with realestate.com.au (for submitting listings)?

    Also, am I correct in assuming that because we are using Joomla on a customer site, that your FeedSync will not work with it?

    Cheers.

    • Hi Mark,

      Where does REAXML come from?
      The REAXML feed comes from a provider like MyDesktop, Box&Dice and several others. Realestate.com.au only accepts feeds from one of these providers; individual offices cannot send a feed and can only enter listings manually. MyDesktop has a Lite account so your office can manage all its listings from one place and submit them to the various property portals and Realestate.com.au. Your office will require an account with each portal, some free, many paid.

      Then that feed can be sent to your site via FTP and you can use FeedSync to process the files ready for import.

      Read this post to understand the steps, except that you will need to replace some steps for Joomla. In particular, WP All Import imports files into WordPress, so you’ll need to find a Joomla XML importer or write your own.

      FeedSync for Submitting to Realestate.com.au
      Unfortunately Realestate.com.au requires a feed provider like MyDesktop to supply the REAXML files; they do not accept REAXML feeds from individual offices.

      Will FeedSync work for us?
      Yes, it processes the files so they can be imported repeatedly, but you’ll need to write an importer or find one. (I could not find an importer, but you may.) It is a stand-alone script that operates outside of any CMS like WordPress and Joomla.

      It may be much easier for you to re-create your site in WordPress and use FeedSync, WP All Import and Easy Property Listings, as creating an importer may cost or take more than it’s worth.

      Hope this helps you understand the process.

  11. Hi Merv

    I stumbled on this page after searching for a solution for my real estate client. They are currently using a site created by AgentPoint in WordPress that integrates myDesktop. From what I have read and heard, myDesktop does not integrate too well with WordPress.

    Having read through FeedSync, does it offer integration between myDesktop and WordPress?

  12. Hi Merv,

    Are the property listing styles fully customizable? I mean, will they be able to match the style of the theme I will be using for my WP site?

    Kind Regards,
    Vishal

  13. Hi Merv,

    I purchased the FeedSync app yesterday and followed your video and instructions. The website shows up and so do all the XML files I am getting from the feed provider. When I press Process Feed it says “Operation Completed” but I don’t see any processed files anywhere. Not sure what’s wrong.

    Url: http://eastcoastbaysrealestate.co.nz/XML/feedsync/

    Your help will be much appreciated.

    Kind Regards,
    Vishal

    • Hi Vishal, you are having the same issue that Yusuke above is having. Your server must allow file writing and script execution. Often security settings on the hosting server prevent this.

    • Hi Yusuke,

      You have a permissions/security issue on your server hosting that is preventing file writes. See the help page on your FeedSync install under

      Litespeed and Apache users

      In order for the plugin to work you will need to disable a security setting in Litespeed web server called ‘Script Restricted Directory Permission Mask’. This security setting prevents uploaded scripts in a directory from being executed. Once it is disabled the plugin will operate.

  14. Hey Merv, I’m liking the sound of your plugin.

    Please help me with a few questions:
    1. What happens when a listing in the Realestate.com.au feed is marked as sold and deleted from the feed – does the copy on the WordPress site get deleted as well?
    2. Can the listing be retained in the database?
    3. Can the listing be marked as sold and a ‘sold’ class be appended to it?

    That’s all I’ve got for now. Chat soon,
    Ev

  15. Hi Merv,

    I am about to develop a site which isn’t using WordPress. Is your application still able to work well in this scenario? I just played around with your demo, http://www.realestateconnected.com.au/demos/feedsync/lib/feedsync-processor.php, and it looks like it combines all feeds into three large feeds instead of inserting into a database. In this case how does the search feature work? Do you know if your application works well with feeds sent out from MyDesktop?

    Thanks Merv looking forward to your response

      • Hi Andrew, FeedSync creates new XML files by overwriting the old entries. When it was developed we created it to solve an import issue for another plugin and direct to database was not possible.

        We are planning on a direct to WordPress importer but WP All Import has that covered as it can handle any worldwide real estate format into Easy Property Listings.

        What you could do is append an additional script to import into your database and use our code as the record matching part.

        We are about to release the next update which is faster and also provides a single file output option too.

        Some REAXML formats include zip files containing jpg images, which are supported by FeedSync.

        FeedSync uncompresses the zip files, moves the jpgs to a processed folder and adds a valid URL to the images in the XML files during processing.

        • Thanks Merv. If I purchase the current version now, will I be eligible for a free upgrade when the new version is released? And do you sell any scripts which will import into a database, or read the XML and output to the page directly?

  16. Feedsync works so well with MantisProperty (www.mantisproperty.com.au)

    We have a number of agents using MantisProperty to manage their listings, which then feed to their WordPress websites and are automatically processed by FeedSync.
