Re: XML import in to Jobberbase MySql database tool
Any updates? I have been looking for a way to import a CSV of jobs for months, so I'm glad to see development!
Hey Steve!
Do you have any new updates or progress on this subject? Would love to see it working!!
Any update on this?
It would be nice to see the RSS import feature created by RedJumpsuit in the next version.
I have used it for over a month now and it works like magic.
Right now it uses an RSS feed from one source only, and all the work of adding RSS feeds has to be done by the admin.
Allowing users (employers) to submit their RSS feeds for each category would be a very nice improvement.
Hey Hobo,
Where can I find this post of RedJumpsuit's? I've looked at his site but didn't find this topic. Could you be so kind as to post a link? Thanks in advance!!
Chris
Last edited by chrisdegrote (2009-08-17 19:26:23)
hi, this is not listed on any of my posts as it was a custom work i did for hobo. the thing there is it's "customized", it's pulling data from 1 source only and so the code is specific to that particular site.
haven't found a solution yet to make it work for just any xml/rss feed
i can only point once again to my prior post within this thread and to a nifty rss-import demonstration video by the pligg developer himself.
@redjumpsuit
Thanks for clearing that up; I thought I had missed something.
@uniq sorry for being blunt but...
Of course I know of Magpie (I can google), but the documentation is very poor, and I believe the developer has abandoned it, because important links such as the forum and blog don't work. And sorry, the video you posted is sweet and makes my mouth water, but what added value does it have for jobberbase? Sorry for my assumption, but I simply don't see the link between the two CMS systems.
give simplepie a try. this is what i used for hobo's website and it works like a charm!
I'm also keen to have an update...
Has SteveSPI made any progress with this? Or anyone else for that matter?
Thanks,
Matt
@chris
i've got the rss import of pligg modified in such a way that it imports neatly into my db - that's the link for me...
Hey Uniq and others,
Well, I've looked at Pligg and downloaded and checked the rss_import module. It truly looks solid (as shown in Uniq's link), but I have no idea where to start to implement it properly in a Jobberbase database.
I would really like to see the steps you have to take to make it communicate with the Job table of our JB databases. And what kind of query (if any) do you have to execute in phpMyAdmin/MySQL?
Which folder do you have to upload the module to (root/rss_import), and which other Pligg base files are also necessary?
And what do you have to edit in the files of the rss_import folder?
I think a lot of users of Jobberbase would be very excited to see this working and make the sites more useful for their visitors.
Sorry for being such a pain in the butt
Chris
Hey all,
Here's a small and disappointing update on how to integrate an RSS/XML feed on my site.
SimplePie looks like the program to make it all happen. If you go to their site http://simplepie.org/ you can download it and put the (whole) file in your root folder. First, check the test pages designed to see whether everything (server side etc.) is working properly.
The files to look for in the downloaded SimplePie files are test/test.php and compatibility_test/sp_compatibility_test.php.
When that's all okay, look at demo/multifeeds.php. This file makes it easier to see how multifeeds work.
If it doesn't work, look here for the fix: http://tech.groups.yahoo.com/group/simp ssage/4240
I've asked on their support forum/group http://tech.groups.yahoo.com/group/simp ssage/4323 whether it is possible to import the feeds into our Jobs table (MySQL) so that they integrate seamlessly with the other jobs. Unfortunately, that requires PHP programming knowledge, which I don't have. If somebody knows how to do this, I would really like to know how to make it all work.
Of course you can add an extra block on your index page with multiple EXTERNAL feeds, but that's not the effect I want. I would like to give the visitor the idea that the feed is part of the site and not of an external site.
Thanks Chris
PS a nice video on how SimplePie works
http://css-tricks.com/video-screencasts
simplepie/
Last edited by chrisdegrote (2009-08-27 11:50:27)
hi chris, sorry for the long wait, i'll try to write up a little how-to, although please bear with me as i'm loaded with work...
Hey Uniq!
That would be great if you can find the time to post your how to. I would really appreciate your effort.
Looking forward to your how-to
Cheers Chris
Sounds interesting. If it works, add it to JB2.0?
I guess there's been no tangible progress... or at least none that is being posted here? So I'll try to revive this thread.
Would someone knowledgeable in the integration of SimplePie be able to use what redjumpsuit has suggested? He said it was pretty easy to integrate a single feed with Jobberbase. So if that is possible, can't SimplePie just take a load of feeds, mash them together and provide the output as a single file that can then be integrated using redjumpsuit's methodology? Or am I over-simplifying things here?
hi matt, that actually makes sense. i never tried it that way, maybe when i get some time (or if anyone is interested in funding this for the community) i'll try to figure out how this can get done.
Just following up the earlier posts... can someone post a quick tutorial about how to incorporate a SimplePie feed in to Jobberbase please?
Merging the feeds in SimplePie has proved to be easier than I thought: http://simplepie.org/wiki/reference/sim erge_items
I'm guessing the next stage will be the requirement to establish cron jobs to schedule this type of import...
Would the jobs have to be imported in to the Jobberbase database, or might it be possible to just parse the results into an 'external jobs' section that displays below the main Jobberbase results? I imagine the downside of this is that the jobs are never truly in the Jobberbase SQL database then though...
Ultimately, I would like to be able to accept XML feeds from the multi-job board posting tools such as idibu, broadbean or conkers (in the UK) - so, any help would be hugely appreciated - thanks!
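For anyone who wants to see the "mash the feeds together" idea without installing anything, here is a rough plain-PHP sketch. It is not SimplePie's merge_items code; it assumes each feed has already been parsed into an array of items, and the function name merge_feed_items and the 'url', 'title' and 'date' keys are made up for illustration.

```php
<?php
// Hypothetical helper: merge already-parsed feed items into one list.
// Each item is assumed to be an array with 'url', 'title' and 'date' keys
// ('date' as a Unix timestamp); these key names are illustrative only.
function merge_feed_items(array $feeds, int $limit = 0): array
{
    $merged = array();
    foreach ($feeds as $items) {
        foreach ($items as $item) {
            // Key by URL so the same job appearing in two feeds is kept once.
            if (!isset($merged[$item['url']])) {
                $merged[$item['url']] = $item;
            }
        }
    }
    $merged = array_values($merged);

    // Newest first.
    usort($merged, function ($a, $b) {
        return $b['date'] <=> $a['date'];
    });

    // A limit of 0 means "return everything".
    return $limit > 0 ? array_slice($merged, 0, $limit) : $merged;
}

$feedA = array(
    array('url' => 'http://example.com/job/1', 'title' => 'Tester', 'date' => 100),
    array('url' => 'http://example.com/job/2', 'title' => 'Developer', 'date' => 300),
);
$feedB = array(
    array('url' => 'http://example.com/job/2', 'title' => 'Developer', 'date' => 300),
    array('url' => 'http://example.org/job/9', 'title' => 'Designer', 'date' => 200),
);

foreach (merge_feed_items(array($feedA, $feedB)) as $item) {
    echo $item['date'] . ' ' . $item['title'] . "\n";
}
```

The merged list could then be shown in an 'external jobs' block or handed on to an import step; which of the two is better is exactly the open question above.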
Currently, the jobs go into the database, and you then delete them like regular jobs using the maintenance cron (if you got it working).
The code below is simplefeeds_cronjob.php, which goes in the root app folder:
<?php
/**
 * jobber job board platform
 *
 * @author     Filip C.T.E. <http://www.filipcte.ro> <me@filipcte.ro>
 * @license    You are free to edit and use this work, but it would be nice if you always referenced the original author ;)
 *             (see license.txt).
 */

require_once 'config.php';
require_once '_includes/simplepie.inc';
require_once '_includes/class.FeedToDB.php';

// select all feeds that are active
$sql = 'SELECT * FROM feed_db WHERE is_active = 1';
$result = $db->query($sql);

while ($row = $result->fetch_assoc())
{
    if (isset($row['url']))
    {
        $url = $row['url'];
        $feed = new SimplePie();
        $todb = new FeedToDB();
        $data = array();

        $feed->set_feed_url($url);
        $success = $feed->init();
        $feed->handle_content_type();

        // default starting item
        $start = 0;
        // default number of items to display. 0 = all
        $length = 0;

        // if single item, set start to item number and length to 1
        /*
        if(isset($_GET['item']))
        {
            $start = $_GET['item'];
            $length = 10;
        }
        */

        // set item link to script uri
        $link = $_SERVER['REQUEST_URI'];

        if ($success)
        {
            $type_id = $row['type_id'];
            $category_id = $row['category_id'];
            $company = $row['company'];
            $poster_email = $row['poster_email'];
            $database = $row['database'];

            // loop through items
            foreach($feed->get_items($start,$length) as $key=>$item)
            {
                // set query string to item number
                $queryString = '?item=' . $key;

                // if we're displaying a single item, set item link to itself and set query string to nothing
                if(isset($_GET['item']))
                {
                    $link = $item->get_link();
                    $queryString = '';
                }

                // display item title and date
                echo '<a href="' . $item->get_permalink() . '">' . $item->get_title() . '</a>';
                echo ' <small>'.$item->get_date('j M Y, H:i:s O').'</small><br>';
                echo ' <small>'.$item->get_content().'</small><br>';
                echo ' <small>'.$item->get_permalink().'</small><br>';
                echo '<br>';

                $permalink = $item->get_permalink();
                $title = get_title();
                $rssdate = $item->get_date();
                $content = addslashes(trim($item->get_description()));
                $location = get_location();
                $url = $permalink;
                $desc = strip_tags(trim($content));

                /*
                echo "loc:" .$location ."<br/>";
                echo "url:" .$url ."<br/>";
                echo "desc:" .$desc ."<br/>";
                */

                $data = array('database' => $database,
                              'type_id' => $type_id,
                              'category_id' => $category_id,
                              'title' => $title,
                              'description' => $desc,
                              'company' => $company,
                              'url' => $url,
                              'location' => $location,
                              'created_on' => $rssdate,
                              'poster_email' => $poster_email);

                $todb->save($data);
            }
        }
        else
        {
            // Check to see if there are more than zero errors (i.e. if there are any errors at all)
            if ($feed->error())
            {
                // If so, start a <div> element with a classname so we can style it.
                echo '<div class="sp_errors">' . "\r\n";
                // ... and display it.
                echo '<p>' . htmlspecialchars($feed->error()) . "</p>\r\n";
                // Close the <div> element we opened.
                echo '</div>' . "\r\n";
            }
        }
    }

    echo "<strong>Feed upload for ". $row['database'] ." - ". $row['category_label'] ." completed.</strong><br /><br />";
}
?>
Then, in the _includes folder, save the following as a new file named class.FeedToDB.php:
<?php
/**
 * jobber job board platform
 *
 * @author     RedJumpsuit <myredjumpsuit@gmail.com>
 *
 * FeedToDB class handles feeding RSS/XML feed to MySQL
 * Thanks to SimplePie :)
 *
 */

class FeedToDB
{
    function __construct()
    { }

    public function save($data)
    {
        global $db;

        // category
        $dbname = $data['database'];
        $category_id = $data['category_id'];

        /*
        if (!is_numeric($category_id) && strstr($category_id, "|"))
        {
            $cats = array();
            $cats = explode("|", $category_id);
            $dbname = trim($cats[0]);
            $category_id = trim($cats[1]);
        }
        */

        // skip jobs that were already imported (same url)
        $sqldup = 'SELECT * FROM '. $dbname.'.jobs WHERE url = "'. trim($data['url']) .'"';
        $resultdup = $db->query($sqldup);
        $rowdup = $resultdup->fetch_assoc();

        if (!$rowdup)
        {
            // try to match the feed location against a known city
            $sqlcity = 'SELECT id FROM '. $dbname.'.cities WHERE name LIKE "%'. $data['location'] .'%"';
            $resultcity = $db->query($sqlcity);
            $rowcity = $resultcity->fetch_assoc();

            if (is_numeric($rowcity['id']) && $rowcity['id'] > 0)
            {
                $city_id = $rowcity['id'];
                $outside_location = '';
            }
            else
            {
                $city_id = -1;
                $outside_location = $data['location'];
            }

            $sql = 'INSERT INTO '. $dbname.'.jobs (type_id, category_id, title, description, company, city_id, url, created_on, is_temp, is_active,
                        views_count, auth, outside_location, poster_email, apply_online)
                    VALUES (' . $data['type_id'] . ',
                            ' . $category_id . ',
                            "' . $data['title'] . '",
                            "' . $data['description'] . '",
                            "' . $data['company'] . '",
                            ' . $city_id . ',
                            "' . $data['url'] . '",
                            NOW(), 0, 1, 0, "' . md5(uniqid() . time()) . '",
                            "' . $outside_location . '", "' . $data['poster_email'] . '", 0)';
            $db->query($sql);
        }
        else
        {
            echo "<strong>FAILED: This Job already exists! </strong><br /><br />";
        }
    }
}
?>
You will also need to put simplepie.inc and the /idn/ folder inside your _includes folder; download the latest copies of these two from http://simplepie.org/downloads/
Feed URLs go in the database. Paste this SQL into phpMyAdmin:
1. Create this table in your jobberBase db
--
-- Table structure for table `feed_db`
--
CREATE TABLE IF NOT EXISTS `feed_db` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `database` varchar(255) NOT NULL,
  `is_active` int(4) NOT NULL DEFAULT '0',
  `type_id` int(4) NOT NULL,
  `category_id` int(4) NOT NULL,
  `category_label` varchar(255) NOT NULL,
  `company` varchar(255) NOT NULL,
  `poster_email` varchar(255) NOT NULL,
  `url` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=333 ;
--
-- Dumping data for table `feed_db`
--
INSERT INTO `feed_db` (`id`, `database`, `is_active`, `type_id`, `category_id`, `category_label`, `company`, `poster_email`, `url`) VALUES
(1, 'yourdbname', 0, 0, 3, 'Name For The Incoming Rss Feed', 'company name where you get feed from', 'Email-needs-changing-If-left-out-use-apply-now-mod@localhost.com', 'http://www.example.com/anothejobberbaserssfeed/'),
(2, 'yourdbname', 0, 0, 45, 'Jobs for testers rss', 'jobberbase.com', '', 'http://www.example.com/anothejobberbaserssfeed1/');
As long as you do not make a typo, this list can be very long.
You can add different feeds to the same category by using the same category number:
(1, 'yourdbname', 0, 0, 3, 'Name For The Incoming Rss Feed', 'company name where you get feed from', 'Email-needs-changing-If-left-out-use-apply-now-mod@localhost.com', 'http://www.example.com/anothejobberbaserssfeed/'),
(2, 'yourdbname', 0, 0, 45, 'Jobs for testers rss', 'jobberbase.com', '', 'http://www.example.com/anothejobberbaserssfeed1/'),
(3, 'yourdbname', 0, 0, 45, 'Jobs for senior testers rss', 'jobberbase.com', '', 'http://www.example.com/anothejobberbaserssfeed2/');
45 is the Testers category number; you can find this number in your jobberbase database.
I have removed some code from the original that I am using, so this may or may not be able to grab feeds from other jobberbase sites as is. If it does not work, fix it and post the solution.
The last two steps come directly from the RedJumpsuit tutorial I got by e-mail a while ago for this mod:
"8. On line 15 of 'simplefeeds_cronjob.php', it filters all the Feed URL to those that are set to 'is_active = 1', so you must set the `is_active` field in `feed_db` table to 1 if you want that feed to be grabbed by the cron job.
9. Setup the cron job to run the simplefeeds_cronjob.php file with the interval you wish"
It would also be useful to be able to easily separate the outgoing feed of jobs posted directly on your site from the jobs you are getting in from third parties (via this RSS mod). That way, even if you already have all the jobs I have (because we both got them from the same third party), I could still accept any of the filtered feeds you wish to provide. Of course, the quality of the jobs you provide via this filtered RSS would also play a big role.
If jobs are not filtered this way, I will get lots of duplicate content and any co-operation is nearly impossible.
What do you think about this?
Don't bother to install it, it doesn't work. But I don't understand one thing: why show deliberately broken code? To me it sounds like mockery.
Last edited by dre (2009-11-06 23:03:03)
hi, what hobo showed is a lead on how it can be done, so others can see how they could implement it. trust me it works, you can check his site. so to just copy/paste the code and expect it to work? i don't think so. maybe if you are willing to invest $$ on a custom work (as he did) then you'll get what you want.
while other developers can afford a day or two of full-time work to create custom code and give it away for free, others can't (like me), so you can't just expect someone who paid for a developer's time to give his investment away for free
peace!
I understand that, but it is better to show nothing than to show 30% of the code and then have fun behind the backs of the other users who are looking for a needle in the haystack. I am working on a similar script, and when I'm finished I'll give it away to anyone who needs it.
In the beginning I thought this forum was great because of the free support and people who like to help others if they can, but now I realise that it is just about the money. 'Free' is just a trap for cash.
Hi Dre,
Relax,
It took me some time to figure this one out, but it was quite a simple solution.
This code is tested and works on 1.6, and it is for jobberbase-to-jobberbase transfer.
It is not complete or bulletproof, and it could use some more improvements.
It can also grab any other feed from any website, but styling and format may need adjusting. Feel free to ask any questions here if you need help.
All the instructions above stay the same except for the two PHP files.
1. simplefeeds_cronjob.php, which goes in the root app folder, should be changed to:
<?php
/**
 * jobber job board platform
 *
 * @author     Filip C.T.E. <http://www.filipcte.ro> <me@filipcte.ro>
 * @license    You are free to edit and use this work, but it would be nice if you always referenced the original author ;)
 *             (see license.txt).
 */

require_once 'config.php';
require_once '_includes/simplepie.inc';
require_once '_includes/class.FeedToDB.php';

// select all feeds that are active
$sql = 'SELECT * FROM feed_db WHERE is_active = 1';
$result = $db->query($sql);

while ($row = $result->fetch_assoc())
{
    if (isset($row['url']))
    {
        $url = $row['url'];
        $feed = new SimplePie();
        $todb = new FeedToDB();
        $data = array();

        $feed->set_feed_url($url);
        $success = $feed->init();
        $feed->handle_content_type();

        // default starting item
        $start = 0;
        // default number of items to display. 0 = all
        $length = 0;

        // if single item, set start to item number and length to 1
        /*
        if(isset($_GET['item']))
        {
            $start = $_GET['item'];
            $length = 10;
        }
        */

        // set item link to script uri (not set when the script is run from the command line)
        $link = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';

        if ($success)
        {
            // these values come from the feed_db row and are the same for every item of this feed
            $type_id = $row['type_id'];
            $category_id = $row['category_id'];
            $company = $row['company'];
            $poster_email = $row['poster_email'];
            $database = $row['database'];

            // loop through items
            foreach($feed->get_items($start,$length) as $key=>$item)
            {
                // set query string to item number
                $queryString = '?item=' . $key;

                // if we're displaying a single item, set item link to itself and set query string to nothing
                if(isset($_GET['item']))
                {
                    $link = $item->get_link();
                    $queryString = '';
                }

                // display item title and date
                echo '<a href="' . $item->get_permalink() . '">' . $item->get_title() . '</a>';
                echo ' <small>'.$item->get_date('j M Y, H:i:s O').'</small><br>';
                echo ' <small>'.$item->get_content().'</small><br>';
                echo ' <small>'.$item->get_permalink().'</small><br>';
                echo '<br>';

                // collect the fields to store; title and description are escaped here with addslashes
                $permalink = $item->get_permalink();
                $title = addslashes(trim($item->get_title()));
                $rssdate = $item->get_date();
                $content = addslashes(trim($item->get_description()));
                // jobberbase feeds carry the city in the description as "Location: ...<br />"
                $location = trim(strip_tags($todb->get_string_between($content, "Location:", "<br />")));
                $url = $permalink;
                $desc = strip_tags(trim($content));

                /*
                echo "loc:" .$location ."<br/>";
                echo "url:" .$url ."<br/>";
                echo "desc:" .$desc ."<br/>";
                */

                $data = array('database' => $database,
                              'type_id' => $type_id,
                              'category_id' => $category_id,
                              'title' => $title,
                              'description' => $desc,
                              'company' => $company,
                              'url' => $url,
                              'location' => $location,
                              'created_on' => $rssdate,
                              'poster_email' => $poster_email);

                $todb->save($data);
            }
        }
        else
        {
            // Check to see if there are more than zero errors (i.e. if there are any errors at all)
            if ($feed->error())
            {
                // If so, start a <div> element with a classname so we can style it.
                echo '<div class="sp_errors">' . "\r\n";
                // ... and display it.
                echo '<p>' . htmlspecialchars($feed->error()) . "</p>\r\n";
                // Close the <div> element we opened.
                echo '</div>' . "\r\n";
            }
        }
    }

    echo "<strong>Feed upload for ". $row['database'] ." - ". $row['category_label'] ." completed.</strong><br /><br />";
}
?>
And the class.FeedToDB.php in the _includes folder should be changed to:
<?php
/**
 * jobber job board platform
 *
 * @author     RedJumpsuit <myredjumpsuit@gmail.com>
 *
 * FeedToDB class handles feeding RSS/XML feed to MySQL
 * Thanks to SimplePie :)
 *
 */

class FeedToDB
{
    function __construct()
    { }

    public function save($data)
    {
        global $db;

        // target database and category; numeric ids are cast so they are safe to concatenate
        $dbname = $data['database'];
        $type_id = (int) $data['type_id'];
        $category_id = (int) $data['category_id'];

        /*
        if (!is_numeric($category_id) && strstr($category_id, "|"))
        {
            $cats = array();
            $cats = explode("|", $category_id);
            $dbname = trim($cats[0]);
            $category_id = trim($cats[1]);
        }
        */

        // title, description and location arrive already escaped (addslashes) from the cron script;
        // the remaining strings are escaped here. This assumes $db is a mysqli instance,
        // as the query()/fetch_assoc() calls suggest.
        $url = $db->real_escape_string(trim($data['url']));
        $company = $db->real_escape_string($data['company']);
        $poster_email = $db->real_escape_string($data['poster_email']);
        $location = $data['location'];

        // skip jobs that were already imported (same url)
        $sqldup = 'SELECT * FROM '. $dbname.'.jobs WHERE url = "'. $url .'"';
        $resultdup = $db->query($sqldup);
        $rowdup = $resultdup->fetch_assoc();

        if (!$rowdup)
        {
            // match the location against known cities; an empty location would match every city, so skip it
            $rowcity = null;
            if ($location != '')
            {
                $sqlcity = 'SELECT id FROM '. $dbname.'.cities WHERE name LIKE "%'. $location .'%"';
                $resultcity = $db->query($sqlcity);
                $rowcity = $resultcity->fetch_assoc();
            }

            if ($rowcity && is_numeric($rowcity['id']) && $rowcity['id'] > 0)
            {
                $city_id = $rowcity['id'];
                $outside_location = '';
            }
            else
            {
                $city_id = -1;
                $outside_location = $location;
            }

            $sql = 'INSERT INTO '. $dbname.'.jobs (type_id, category_id, title, description, company, city_id, url, created_on, is_temp, is_active,
                        views_count, auth, outside_location, poster_email, apply_online)
                    VALUES (' . $type_id . ',
                            ' . $category_id . ',
                            "' . $data['title'] . '",
                            "' . $data['description'] . '",
                            "' . $company . '",
                            ' . $city_id . ',
                            "' . $url . '",
                            NOW(), 0, 1, 0, "' . md5(uniqid() . time()) . '",
                            "' . $outside_location . '", "' . $poster_email . '", 0)';
            $db->query($sql);
        }
        else
        {
            echo "<strong>FAILED: This Job already exists! </strong><br /><br />";
        }
    }

    // return the text between $start and $end, or "" when either marker is missing
    public function get_string_between($string, $start, $end)
    {
        $ini = strpos($string, $start);
        if ($ini === false) return "";
        $ini += strlen($start);
        $endpos = strpos($string, $end, $ini);
        if ($endpos === false) return "";
        return substr($string, $ini, $endpos - $ini);
    }
}
?>
It could possibly work on 1.8, but I have not tried it yet.
To see if it works, once you have added a valid category number and a valid RSS feed to your feed_db database table, visit http://example.com/simplefeeds_cronjob.php on your website.
Use at your own risk, and thanks again to RedJumpsuit for coming up with this simple, easy-to-edit code.
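One thing worth considering if you build on this: the INSERT above is assembled by gluing feed text into the SQL string. A possible alternative is to keep the values out of the SQL and pass them as bound parameters. The sketch below only builds the statement and the parameter list, so it runs without a database. The function name build_job_insert is made up for this example, the values are assumed to be passed in unescaped (no addslashes), and actually executing it (for example with mysqli's prepare() and bind_param()) is left out.

```php
<?php
// Hypothetical helper: build a parameterised INSERT for the jobs table.
// Returns array($sql, $params); nothing is sent to a database here.
function build_job_insert(array $data, int $cityId, string $outsideLocation): array
{
    // A database name cannot be bound as a parameter, so only allow
    // plain identifier characters before concatenating it.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $data['database'])) {
        throw new InvalidArgumentException('Unexpected database name');
    }

    $sql = 'INSERT INTO `' . $data['database'] . '`.jobs '
         . '(type_id, category_id, title, description, company, city_id, url, '
         . 'created_on, is_temp, is_active, views_count, auth, outside_location, '
         . 'poster_email, apply_online) '
         . 'VALUES (?, ?, ?, ?, ?, ?, ?, NOW(), 0, 1, 0, ?, ?, ?, 0)';

    // One entry per "?" above, in the same order.
    $params = array(
        (int) $data['type_id'],
        (int) $data['category_id'],
        $data['title'],
        $data['description'],
        $data['company'],
        $cityId,
        $data['url'],
        md5(uniqid() . time()),
        $outsideLocation,
        $data['poster_email'],
    );

    return array($sql, $params);
}

list($sql, $params) = build_job_insert(array(
    'database'     => 'yourdbname',
    'type_id'      => 1,
    'category_id'  => 45,
    'title'        => 'Senior "QA" tester',
    'description'  => "It's a testing job",
    'company'      => 'jobberbase.com',
    'url'          => 'http://www.example.com/job/1',
    'poster_email' => '',
), -1, 'Remote');

echo $sql, "\n";
echo count($params), " parameters\n";
```

Quotes in titles and descriptions then no longer need any escaping, because they never become part of the SQL text.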
Last edited by hobo (2009-11-09 05:54:32)
How do you get content from more than one website? You can already do this, since you can get RSS feeds in the correct format from any jobberbase-powered website. For sites not powered by jobberbase, though, some parameters will be in different places. For example, the location might appear in the feed title instead of the feed content; this would prevent you from sorting these jobs by city in your jobberbase, since the default code would not store this data in your database.
The script currently gets the location by looking for anything in the description field that is between "Location:" and the first "<br />". This may not be the case for all feeds, and that is why modifications are needed.
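As a rough illustration of that kind of modification, the sketch below tries several start/end marker pairs in turn and returns the first non-empty match. The function names string_between and extract_location are made up for this example, and the second marker pair is an invented layout, not taken from any real feed.

```php
<?php
// Return the text between $start and $end, or '' when either marker is missing.
function string_between(string $text, string $start, string $end): string
{
    $from = strpos($text, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($text, $end, $from);
    if ($to === false) {
        return '';
    }
    return substr($text, $from, $to - $from);
}

// Try each array(start, end) marker pair until one yields a non-empty location.
function extract_location(string $description, array $markerPairs): string
{
    foreach ($markerPairs as $pair) {
        $found = trim(strip_tags(string_between($description, $pair[0], $pair[1])));
        if ($found !== '') {
            return $found;
        }
    }
    return '';
}

$pairs = array(
    array('Location:', '<br />'),  // jobberbase-style description
    array('<city>', '</city>'),    // hypothetical layout of some other feed
);

echo extract_location('<strong>Location:</strong> London<br />Great job', $pairs), "\n";
echo extract_location('Apply now <city>Leeds</city>', $pairs), "\n";
```

With something like this, one cron script could cope with a few known feed layouts instead of needing a separate copy per layout, although very different feeds may still need their own handling.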
One way to work around this is to create a new table in your database for each new feed format you wish to add (you can use the same SQL as before, just change the table name), and to make a new copy of simplefeeds_cronjob.php customized to that feed format. You could call the copies simplefeeds_cronjob2.php, simplefeeds_cronjob3.php, and so on.
What needs changing in the cron job code to handle a new feed format?
The new table needs a new name, so in simplefeeds_cronjob2.php replace this line:
$sql = 'SELECT * FROM feed_db WHERE is_active = 1';
with:
$sql = 'SELECT * FROM second_feed_db WHERE is_active = 1';
Replace "second_feed_db" with the name you gave the new table.
The get_string_between function can get the data you need in most cases.
Below is the piece of code from simplefeeds_cronjob2.php that needs modifying.
$permalink = $item->get_permalink();
$title = addslashes(trim($item->get_title()));
$rssdate = $item->get_date();
$content = addslashes(trim($item->get_description()));
$location = trim(strip_tags($todb->get_string_between($content, "Location:", "<br />")));
$url = $permalink;
$desc = strip_tags(trim($content));
When the modifications are done, just set up your two cron jobs so you can get your two different feed types.
You can see if it works by visiting your cron jobs at:
http://example.com/simplefeeds_cronjob.php
http://example.com/simplefeeds_cronjob2.php
Then visit your website to see if it worked.
The only problem I ran into with this code was when I entered the same RSS feed URL in the database table twice, so don't.
If something needs clarifying, or someone has a better idea on how to do this, feel free to contribute.
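To catch the "same feed URL entered twice" mistake before the cron job runs, something along these lines might help. It is only a sketch: it works on rows already read from feed_db (for example with SELECT id, url FROM feed_db), the function names are made up, and lower-casing the whole URL is a simplification that could wrongly merge URLs whose paths differ only by case.

```php
<?php
// Normalise a feed URL for comparison: trim, lower-case, drop trailing slashes.
function normalise_feed_url(string $url): string
{
    return rtrim(strtolower(trim($url)), '/');
}

// Hypothetical helper: report rows whose URL was already seen in an earlier row.
function find_duplicate_feeds(array $rows): array
{
    $seen = array();
    $duplicates = array();
    foreach ($rows as $row) {
        $key = normalise_feed_url($row['url']);
        if (isset($seen[$key])) {
            $duplicates[] = array('id' => $row['id'], 'duplicate_of' => $seen[$key]);
        } else {
            $seen[$key] = $row['id'];
        }
    }
    return $duplicates;
}

$rows = array(
    array('id' => 1, 'url' => 'http://www.example.com/anothejobberbaserssfeed/'),
    array('id' => 2, 'url' => 'http://www.example.com/anothejobberbaserssfeed1/'),
    array('id' => 3, 'url' => 'HTTP://www.example.com/anothejobberbaserssfeed'),
);

foreach (find_duplicate_feeds($rows) as $dup) {
    echo 'Feed ' . $dup['id'] . ' repeats feed ' . $dup['duplicate_of'] . "\n";
}
```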
Last edited by hobo (2009-11-12 21:09:55)