User talk:(WT-en) Bill Ellett/sandbox/Rules

These are my notes (work in progress) on what I have distilled from the policy pages and Wikivoyage talk pages.

There are exceptions, but generally:

status
Every article in MAIN namespace should have a status. Either star, star nomination, guide, usable, outline, or stub. Most obvious exception is Main page.

Articles with vfd or merge may or may not have a status or type. May not be worth adding if they are on the way out.

Redirect pages and disambiguation pages are also excluded from status or type.

The status templates "guide" and "usable" should be upgraded to templates such as usablecity, usableregion, usabletopic. This affects the text the user sees, so the generic ones are not as good.

Almost all pages in MAIN namespace have a status. Since the status template creates a category, pages which don't have a category will show up in the special pages: uncategorized pages.

type
Every article in MAIN namespace should have a type: Either district, city, region, state, country, continentalsection, continent, park, itinerary, topic, or phrasebook.

Articles with vfd or merge may or may not have a status or type. May not be worth adding if they are on the way out.

Redirect pages and disambiguation pages are also excluded from status or type.

About half of all wikivoyage pages do not have a type specified. Most of these are either cities or regions.

IsPartOf/IsIn
Every article for city, region, state, country, continentalsection, continent, or park, and many for itinerary, should have either isin or ispartof. It is not required on district articles to get breadcrumbs, but is probably desirable for external processing. It should not be on topic or phrasebook articles.

If a page does not have either isin or ispartof, and if it also does not have traveltopic template, then it is probably either an unidentified topic or it needs to have ispartof created.
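As a minimal sketch of that check, assuming the page lists have already been downloaded as plain lists of page names (the function and parameter names are hypothetical, not part of any Wikivoyage tool):

```python
def pages_missing_parent(all_pages, with_isin, with_ispartof, with_traveltopic):
    """Return pages with neither isin nor ispartof that are not tagged traveltopic.

    Each argument is a list of page names taken from the matching template list.
    """
    has_parent = set(with_isin) | set(with_ispartof)
    # Whatever is left is either an unidentified topic or needs ispartof added.
    return sorted(set(all_pages) - has_parent - set(with_traveltopic))
```

Each page in the result still needs a manual look to decide which of the two cases it is.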

How to identify problems for update
First download list of all articles with specific templates. These are all accessed from "special pages: most linked to templates" page, with easily selected link. Add "&limit=5000" to the url if there are lots of links, so you can get 5,000 at a time. Max supported is 5,000.

Second, edit file: remove any link which includes the text "redirect", or a colon. Colon means it is not in MAIN namespace.

Third, sort unique to remove duplicates from the list.

Now you have a list of articles in MAIN which have that template.
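The second and third steps can be sketched in a few lines. This assumes the downloaded listing has been saved as one page name per line, with redirect entries marked by the word "redirect"; the function name is hypothetical.

```python
def clean_link_list(lines):
    """Reduce a raw listing to unique MAIN-namespace page names."""
    kept = set()
    for line in lines:
        title = line.strip()
        if not title:
            continue
        # Drop redirect entries; the listing marks them with the word "redirect".
        if "redirect" in title.lower():
            continue
        # A colon means a namespace prefix, so the page is not in MAIN.
        if ":" in title:
            continue
        kept.add(title)
    # Using a set removes duplicates; sorting gives a stable list to compare.
    return sorted(kept)
```

Note that this also drops any real article whose name happens to contain a colon or the word "redirect", just as the manual edit would.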

Then compare lists. For example, if a page is listed in Guide and in Cityguide, then it should be changed to Guidecity and Cityguide. If it is in Usablecity but not in Cityguide or Districtguide, then add the second template.
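The two comparisons in that example are plain set operations. A rough sketch, assuming each argument is a cleaned list of page names for the template of the same name (the function name is hypothetical):

```python
def compare_status_and_type(guide, usablecity, cityguide, districtguide):
    """Return (pages to upgrade to Guidecity, pages missing a type template)."""
    guide, usablecity = set(guide), set(usablecity)
    cityguide, districtguide = set(cityguide), set(districtguide)
    # Generic Guide together with Cityguide: the status should become Guidecity.
    upgrade_to_guidecity = sorted(guide & cityguide)
    # Usablecity with neither Cityguide nor Districtguide: add the type template.
    needs_type = sorted(usablecity - cityguide - districtguide)
    return upgrade_to_guidecity, needs_type
```

The same pattern repeats for each other status/type pair.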

If the pagename contains a slash, it is almost always a district. If it contains the word Phrasebook then it is almost always a phrasebook.

Before making changes, you must look at the specific page or pagename, but generally:
 * If template cityguide and name contains a slash, change to districtguide for almost all cases.
 * If no type template and name contains a slash, add districtguide.
 * If no type and either guidecity or usablecity, then add either cityguide or districtguide (does it have a slash?).
 * If no type and either guideregion or usableregion, then add regionguide (remember, already took care of continents, states, continental sections).
 * Continue for each type/status combination.


 * Remember to set outlineitinerary or outlinephrasebook for those two types, all the other types are just "outline"


 * For the Guide, Usable, and Outline templates, first get the full list of pages. For guide and usable, if they also have a type template then change to usablepark, guidecity, or such. For outline, just remove them from your working copy of the list. Now look for slashes and add districtguide; look for phrasebook in title and add outlinephrasebook, etc. Almost the only things left are city, region, park, itinerary, and topic. Now read the list of page names. Topics generally pop out at you by name. Itineraries usually will also pop out. So all you have left are city, region, and park. Look at pagename for keyword like park or monument, and you'll get lots of parks, but not all. And beware of cities like Highland Park (Illinois) which is just waiting to catch you. By now, you'll have almost everything out of the list except cities and regions, and a few oddball pages.
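The pagename rules above can be sketched as a first-pass guesser. It returns the bare type names used in these notes, and the function name is hypothetical. It is only a hint: every page still needs a look before it is changed, and anything it cannot place comes back as None.

```python
def suggest_type(pagename, status=None):
    """Guess an article type from its name and status template, or None."""
    name = pagename.lower()
    # "phrasebook" in the title is almost always a phrasebook.
    if "phrasebook" in name:
        return "phrasebook"
    # A slash in the title is almost always a district.
    if "/" in name:
        return "district"
    # With no slash, a specific status template implies the type.
    if status in ("guidecity", "usablecity"):
        return "city"
    if status in ("guideregion", "usableregion"):
        return "region"
    # Topics, itineraries, and parks have to be picked out by reading the name.
    return None
```

Keyword matching on "park" is deliberately left out, because of names like Highland Park (Illinois).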

Already done
I have tried manually to look at and update all continent, continental section, country, and state pages so that they have the specific type template. Happily, these are not normally created new. So if it is bigger than a city and has no type template, then it is either a region or a park. This normally meant adding Countryguide to members of the United Nations, looking at the list of countries or continental sections we have on our continent pages, and the list of countries on the continental section pages. Then updating templates on those countries as needed.

If you have something smaller than a region and it does not have a slash, then it is either a city or a park.

Traveltopic tend to be easily identified, usually just by reading the pagename.

Phrasebooks should all have "phrasebook" in the pagename. (Of course there's the Devanagari exception.) So search Google for "intitle:phrasebook site:wikivoyage.org/en" to find the full list.

I took the "list of itineraries" page, "travel topics" page, and "list of phrasebooks" page, and extracted all the linked pages. Then matched those to the templates, and added or updated the templates as needed. I then compared the list of links to the list of pages with the traveltopics (etc) templates, and added missing pagenames to the list pages. However, we've found a batch more itineraries and topics since then so probably need to do this again soon.

countries
Proposed definition:
 * If member state of UN then it is a country.
 * If the area is considered by a good part of the world as a country, then it is. Vatican. Palestine.
 * If the area has friendly relations with another place, then compare closeness to Puerto Rico. Puerto Rico is not a country. Falkland Islands is not a country. But even though Canada has close relations with England, it is a country. What about Greenland?
 * If the area has unfriendly relations with somebody who claims it, then ask what the traveler should do. Who do you go to for a visa? Who runs the border crossings? Who runs the police dept? Who collects the taxes? The politician may have all sorts of ideas, but what is the effect on the traveler? It should also be stable. In some cases the political world still hasn't accepted countries which declared independence 20 years ago, but for the traveler, they are successful "countries".

districts
There are a number of articles with slash but first word is "diving in". These of course are not districts. There are also a few other oddballs, such as cityname/Archive or cityname/deleted_listings.

There were some pages which have the districtguide template but the pagename has no slash. Most of these turned out to be cities, and were changed to cityguide. Some are districts, and some are who the heck knows what. The ones which are districts probably should be renamed after discussion. Or proposed for vfd.

regions vs cities
A region generally has to have city articles within it. If it is an island with a few small communities then it gets the city template, not the region template. This is important when we get to usable and above, since usableregion and usablecity print very different messages.

park vs region
If it has fees and hours, then it gets the park template. If it includes towns then it is a region. If it has a single administrative head or owner then it's a park.

Redirect problems
If ispartof points to a redirect page, then the breadcrumb is broken. Change to the real page.

The disambiguations special page lists about 4,000 pages. In an ideal world, this would be zero. These are pages where a link in the article points to a disambiguation page. But if you are in Durango, Colorado and link to Silverton, the user should go straight to Silverton (Colo), not to a disambig page where they have to pick between Silverton, Oregon and Silverton, Australia. There are some good links, where a redirect page points to a disambiguation page. But most of the 4,000 are problems.

If a redirect page includes text, then that text will be searchable, and the page will be in the user's hitlist for their search. But the user can't see the text! It remains hidden unless they go to edit mode or some other extraordinary condition. So if you find text on a redirect page, it should be compared to the text in the redirected page, and missing information should be added. Then delete the text from the redirect page. Generally, all it should have is the redirect template. When it was redirected, it may be that the redirector did not have the time or inclination to merge the data so they just left it behind. And if they did merge the information then it is a quick process to just delete the remaining text. Many of these pages are quickly identified in the display of pages linked to a template. They will have the word "redirect" following the pagename. Happily, it is a good expectation that most pages which are redirected and still contain text will still contain their old templates, so they pop up.
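A small sketch of spotting leftover text, assuming you already have the raw page source and that the redirect is written in the usual MediaWiki form `#REDIRECT [[Target]]` at the top of the page. The function name is hypothetical.

```python
import re

# Matches a redirect at the very start of the page source (case-insensitive).
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[[^\]]+\]\]", re.IGNORECASE)

def leftover_text(wikitext):
    """Return the text left after the redirect, or None if not a redirect page."""
    match = REDIRECT_RE.match(wikitext)
    if match is None:
        return None
    # Anything after the redirect is hidden from readers but still searchable.
    return wikitext[match.end():].strip()
```

An empty string means the redirect page is clean; anything else is text to compare against the target page before deleting it.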

Breadcrumb problems
Generally, all breadcrumbs should point all the way back to a continent. But sometimes they stop early. There are several causes of these breadcrumb problems.
 * isin/ispartof is pointing to a redirect page. Breadcrumbs do not survive past a redirect. Change the isin/ispartof to the real pagename.
 * isin/ispartof is pointing to a non-existent page. This may be a typo, or it may be pointing to a page which ought to be there but has never been created. If the page should exist, create it. If it is a typo, correct it. Or it may be that this place really should point to someplace else anyway. Then change it.
 * isin/ispartof is malformed. IsIn should have underscores and codes, no spaces or parentheses. IsPartOf is the reverse. If it doesn't work, fix it. Usually it is easiest to change to ispartof and format it this way. Or they may have missed the pipe, or used brackets for braces, or just one brace instead of two.
 * isin/ispartof is pointing to a page just fine, but that page doesn't have an isin/ispartof, or the one on that page is broken. You can select the higher page from the breadcrumb on this page, and then need to fix that one.

And of course it is also possible you just have a buffering problem which needs pages to be purged.
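The causes listed above can be checked offline by walking the chain of parents. This is only a sketch under assumptions: the parent links, the set of existing pages, and the set of redirect pages have already been extracted into plain Python structures, and all names here are hypothetical. The loop check is an extra safeguard that these notes do not mention.

```python
def trace_breadcrumb(page, parent_of, existing, redirects, continents):
    """Follow isin/ispartof links upward; return (trail, reason the chain stopped)."""
    trail = [page]
    current = page
    while current not in continents:
        parent = parent_of.get(current)
        if parent is None:
            return trail, current + " has no isin/ispartof"
        if parent in redirects:
            return trail, current + " points to redirect page " + parent
        if parent not in existing:
            return trail, current + " points to non-existent page " + parent
        if parent in trail:
            # Added safeguard: stop instead of looping forever.
            return trail, current + " loops back to " + parent
        trail.append(parent)
        current = parent
    return trail, "ok"
```

A malformed template would normally show up here as a missing entry in `parent_of`, since nothing could be extracted from it.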

Itinerary
If the page is about a specific road then it is considered an itinerary, such as Interstate-64. But if the page is about the area around the road, then it is a region, even if the pagename matches the road. Keywords to look for in the page name are road, highway, route, parkway. If you find a road, it should be added to the official list of roads at Wikivoyage:Routes_Expedition#Articles.

airports
There are only four or five airport articles. Not sure if they should be a district of their city, a separate city, or a region article (probably not region). Wikivoyage policies need clarification.

cruiseships
These show up in the list of pages which have no isin or ispartof, so they get identified easily. wikivoyage policies remain in flux for these, so I'm leaving them alone for adding type templates.

seedistrict printdistrict
Once all the districts are identified, it will be easy to get a list of cities which have districts. Then we can determine if they should have the print and see district templates added. seedistrict is on almost none right now. But only add it if the district articles really do contain listings. If the district articles are just outlines without the listings being divided, then it would be too early to add these.

If the city turns out to only have one district article, maybe it should be considered for whether page is appropriate or vfd.

Why do all this

 * First, getting the templates correct permits the pages to be cleaner. They sometimes affect display. For example, if "guide" is on a traveltopic page then it tells all about how the traveltopic contains hotel and restaurant listings. So getting guide changed to guidetopic cleans this up.


 * Second, if the templates are assigned, then it is possible to get a list of things like cities with districts, or outline itineraries, or countries contained in the system.


 * Third, if the templates are well assigned then you can run statistics from the database, such as how many cities or how many regions represented. What percent of parks are usable or better. Or other assessments.


 * Fourth, you can include template names in a search, so you can limit your results if you know how.
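As an illustration of the third point, here is a rough sketch of the kind of statistic that becomes possible once every page has a type and a status. It assumes the data has been collected into a dict of page name to (type, status); that structure, the function names, and the ranking of statuses from worst to best (including where "star nomination" sits) are assumptions, not something Wikivoyage provides.

```python
from collections import Counter

# Assumed ranking, worst to best.
STATUS_ORDER = ["stub", "outline", "usable", "guide", "star nomination", "star"]

def count_by_type(pages):
    """Count how many pages there are of each type."""
    return Counter(page_type for page_type, _ in pages.values())

def percent_at_least(pages, page_type, minimum="usable"):
    """Percent of pages of one type whose status is `minimum` or better."""
    floor = STATUS_ORDER.index(minimum)
    statuses = [status for t, status in pages.values() if t == page_type]
    if not statuses:
        return 0.0
    good = sum(1 for status in statuses if STATUS_ORDER.index(status) >= floor)
    return 100.0 * good / len(statuses)
```

For example, `percent_at_least(pages, "park")` answers "what percent of parks are usable or better".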