Difference between revisions of "Rewrite PageCreationBot"
(2) Rewrite PageCreationBot (Mohammad Ghufran, Umar Sheikh)
Latest revision as of 11:31, 19 December 2013
What (summary)
- New page-building bot
- Still relies on Java/Tomcat to do crawling (for now)
- Carefully tested
Current Status
- Creates new pages based on a template (done)
- Monitoring and logging have been added (done)
- Test cases added (done)
- We have created a sample page, a rough sketch of what a page looks like after the bot creates it: PageCreationBot_Sample.
- The current version of PageCreationBot does not use the thumbnail extracted from Alexa; it uses the thumbnail tag from the Domain_Page template instead.
  - This can be changed by using the get_thumbnail function, which is already in place.
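The thumbnail switch mentioned in the last item could be sketched roughly as follows. This is only an illustration under assumptions: the real get_thumbnail signature is not documented on this page, so the code takes it as a callable that maps a domain to a URL (or None). The names choose_thumbnail, fake_get_thumbnail and TEMPLATE_THUMBNAIL_TAG, and the sample data, are hypothetical.

```python
from typing import Callable, Optional

# Placeholder standing in for the thumbnail tag used by the Domain_Page
# template; the real tag text is not given on this page (assumption).
TEMPLATE_THUMBNAIL_TAG = "<template-thumbnail-tag>"


def choose_thumbnail(
    domain: str,
    get_thumbnail: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Return the thumbnail to put on a newly created page.

    If a get_thumbnail callable is supplied and yields a URL, use that URL.
    Otherwise fall back to the template tag, which is the current behaviour.
    """
    if get_thumbnail is not None:
        url = get_thumbnail(domain)
        if url:
            return url
    return TEMPLATE_THUMBNAIL_TAG


def fake_get_thumbnail(domain: str) -> Optional[str]:
    """Hypothetical stand-in for the real get_thumbnail (invented data)."""
    known = {"example.com": "http://thumbs.example/example.com.png"}
    return known.get(domain)
```

Keeping the template tag as the fallback means a failed or missing thumbnail lookup still produces a page that looks like the ones the bot creates today.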
Why this is important
- We need to have control over the pages that are created on our site.
- The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
- We gain mastery over the code so that we can add new features easily.
DoneDone
- Creates new pages based on a template
- Monitoring and logging have been added (tests whether or not the bot succeeds)
  - Output goes to a log file, either on each squal box (with aggregation) or on an NFS volume. Ethan and Michael have been emailed about this.
- Hooked in to all the points where the old bot was
  - Not exactly the same points, but the same end-user functionality.
- Projects:BotTest problems fixed
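The monitoring-and-logging item above (log to a file, record whether the bot succeeded) might look something like this minimal sketch. It uses Python's standard logging module. The logger name, the log format, and the helper names make_bot_logger and create_page_logged are assumptions for illustration, not the bot's actual code.

```python
import logging
from typing import Callable


def make_bot_logger(log_path: str) -> logging.Logger:
    """Return a logger that appends to log_path.

    If the logger was already configured, the existing handler is reused
    so that repeated calls do not write duplicate lines.
    """
    logger = logging.getLogger("pagecreationbot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def create_page_logged(
    domain: str,
    create_page: Callable[[str], None],
    logger: logging.Logger,
) -> bool:
    """Run create_page(domain) and log success or failure.

    Returns True on success and False if create_page raised an exception.
    """
    try:
        create_page(domain)
    except Exception:
        # logger.exception records the message plus the traceback.
        logger.exception("page creation failed for %s", domain)
        return False
    logger.info("page created for %s", domain)
    return True
```

Pointing log_path at a local directory or at an NFS mount covers either of the two storage options; aggregating the per-box files would be a separate step.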
Bot insertion points into Mediawiki
- /wiki/skins/common/generatePage.js (and some other JavaScript that we should remove) (struck out)
- /wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php (deprecate and point to CaseSpace) (struck out)
- /wiki/extensions/CaseSpace/CaseSpace.php (Ultimately, here is where the magic will happen.) (struck out)
- /wiki/extensions/AboutUsBuildDomain/AboutUsBuildDomain.php should be the best place to keep it.
Schema
- New schema location: http://images.aboutus.org/images/b/be/Aboutusbot_new.zip (despite the .zip extension, it is a plain SQL file, not a compressed archive).
Discussion
- I heard a rumor of a possible change in format for new pages. Is this true? Where is the discussion about the new format possibilities happening? TedErnst | talk 13:50, 25 October 2007 (PDT)
- I think that the bot is still using the <graphic> tag instead of the <email> tag with the new name. Please correct me if I'm wrong. :) Vartan 17:21, 25 October 2007 (PDT)