Talk:PostgreSQL
Does anyone know about the automagic joins feature in this article? There doesn't seem to be one, but I'm not a PostgreSQL expert. Eurleif 15:21, Jan 18, 2004 (UTC)
what's with the criticism section?
I took the criticism section completely out (it had been renamed shortcomings). It was poorly written and inaccurate. The information presented there was more appropriate for a PostgreSQL FAQ entry than for this page.
The few bits of good information (for example, the Slony mention) would be more appropriate in a third-party support section, etc.
Merlin
It looks like it was written by a Mysql fanatic who dug anything he could in order to make Pgsql look bad. Lack of replication support and native win32 ports are indeed important issues, but the rest doesn't merit being mentioned, IMO.
- I agree that the latest item on the "Criticisms" list is badly worded (how about "Due to PostgreSQL's MVCC transaction design, some operations (like count(*)) are slower than one might expect at first."), but I find the criticism list good. All the DBMS pages should have sections on the good and bad sides of the DBMS. That way, Wikipedia may become an interesting and useful encyclopedia, and not just yet another advocacy advertising pillar. I always feel sceptical when software descriptions don't mention weaknesses: the rest of the information in such pages becomes dubious. Hopefully, the "Criticisms" section will shrink when PostgreSQL 7.5 is released.
TroelsArvin 18:40, 2 Jun 2004 (UTC)
- Regarding "Due to PostgreSQL's MVCC": I think the specific technical reasons for one or another shortcoming should not be mentioned because it would make the article too bloated. It should be just a list of things that an average user should be aware of when deciding whether to use PGSQL. If you don't like the wording, reword it. User:Gene s
- I think the previous criticism section was better than the present watered-down version. It's also not a good idea to use statements like "Mysql fanatic" unless you have facts to substantiate them. I believe it's wrong to scrap factually correct information because it "makes something look bad". If you don't like the wording, reword it. Renaming the section is fine; otherwise I want to roll back the latest change.
- Well, I was precisely rewording it. The 'Criticism' section has too many details and does not fit well in this article. It gives the impression that the software is incomplete and unusable, which is not the case. It prominently mentions many database terms such as COUNT(), 'materialized views' and 'updateable views', which the 'average user' will know nothing about. It mentions replication without giving any hint of why it's important. I really don't think you were justified in rolling back the changes to the original version, which I consider inadequate for the reasons I just mentioned, so I'll put mine back. I suppose it could be complemented if you feel the need; however, please stick to the important points. Sorry about the 'fanatic' thing. cbraga 13:19, Jun 3, 2004 (UTC)
- No, that's not rewording. You took a factually correct section written by many people over an extended period of time and replaced it with your own, which is not as precise or deep. This is not a PGSQL advocacy group. It does not matter whether correct information makes PGSQL look good or bad. What matters is whether it is correct or wrong, readable or not. So, there are two ways to deal with it. (1) Professional, by trying to find a compromise. (2) Unprofessional, by having a rollback fight. I propose (1): since you were the last one to roll back my and other people's changes, you restore the section and then reword it, keeping all the items on the list.
Gene s 14:14, 3 Jun 2004 (UTC)
- Well then call it rewriting. Look, sometimes it is necessary to rewrite a whole section of an article because it is not adequate. Such was the case with the criticism section, for the reasons I stated before. I don't really care whether many shortcomings are listed, only that they are factual and actually relevant to the article. Some of the previous items were simply inappropriate here. Others gave the false impression that PostgreSQL is an incomplete and unusable product. So I rewrote them into something better: factual, correct and concise. I'm sure you can see that the fact that the original section was written by many people over an extended period of time does not mean it will be well written in the end. In this case, it was not. Also, 'Criticism' is a very poor title choice. You can't criticise software any more than you can criticise a hammer. Software and hammers have shortcomings. Thanks. cbraga 23:43, Jun 3, 2004 (UTC)
- I have no objection to changing the title. I object to removal of valid points. PGSQL is not suitable for some applications and it should be noted. It's not a good idea to hide problems and let people discover them well into the project. Let's rewrite it some more and put back valid criticism. Gene s 07:07, 4 Jun 2004 (UTC)
I think it's reasonable to compare and contrast the "criticisms" that have been placed in this article with those in the article for MySQL. The MySQL article describes actual criticisms expressed by experts in the field, such as C. J. Date, and the way that MySQL's creators have responded to them. It goes on to describe objective errors in MySQL documentation, and the fact that they have been remedied. It also deals with the response of a part of the user community to MySQL AB's license change. It is not a list of Wikipedia editors' complaints about the product, but a description of a sort of dialogue that has gone on between MySQL's creators and the database community (including MySQL users).
Wikipedia is not supposed to present "original research" or the bare opinions of Wikipedia editors. When we call an article section "Criticisms" it needs to refer to criticisms made by authorities in writing elsewhere. Our own complaints do not fit the bill. --FOo 02:59, 4 Jun 2004 (UTC)
- This is a very good idea. Could you provide links to such authoritative opinions so they can be incorporated into the article? I can't completely agree that user opinions are not worthy of inclusion. And I think that issues considered important by PGSQL developers deserve close attention. For example, the matter of poor aggregate performance takes up nearly half of the aggregates page (http://www.postgresql.org/docs/7.4/interactive/functions-aggregate.html) in the documentation. The need for VACUUM is also a specific feature of PGSQL which is unexpected.
Gene s 07:07, 4 Jun 2004 (UTC)
The statement "and the database functions normally while it runs" regarding VACUUM is flat-out incorrect for PGSQL 7.4. Gene s 14:32, 4 Jun 2004 (UTC)
- Beginning in PostgreSQL 7.2, the standard form of VACUUM can run in parallel with normal database operations (selects, inserts, updates, deletes, but not changes to table definitions). Routine vacuuming is therefore not nearly as intrusive as it was in prior releases, and it's not as critical to try to schedule it at low-usage times of day.
- cbraga 16:00, Jun 4, 2004 (UTC)
- "Can run in parallel" is not the same as "the database functions normally while it runs". "Not as critical" is simply because earlier the database was completely unusable while it ran. Now it's kind of useful if the load is light. It does affect performance A LOT, even with 7.4. Why would it otherwise be advisable to schedule it at night if it weren't affecting performance?
- Gene s 07:58, 8 Jun 2004 (UTC)
- Trivially, any other use of the database server "affects performance". That is, if we measure "performance" by responsiveness to user queries, then the presence of any other activity on the server "affects performance". This would include the activity of another user, or the running of a database dump for backup, or even unrelated activity on the same computer as the database server, as well as a VACUUM.
- It's a principle of system administration that one schedules non-time-critical batch processes for times when few time-critical interactive processes are running. This is exactly as true of VACUUM as it is of full backups or any other big, heavy batch process. Making VACUUM sound exceptional in this regard is misleading. --FOo 14:12, 8 Jun 2004 (UTC)
- Absolutely correct. Just like stating that it does not affect performance is also misleading (it was so in the previous edition). The current edition seems to fit the bill. Gene s 14:21, 8 Jun 2004 (UTC)
MVCC
The article currently states:
- "Concurrency is managed via a Multi-Version Concurrency Control (MVCC) design, which ensures excellent performance even under heavy concurrent access"
Isn't it the case that MVCC improves _read_-concurrency (a read will never block or be blocked) while write concurrency isn't affected?
- MVCC improves overall concurrency, since without it a single write operation will lock the full table before proceeding. With MVCC, multiple reads and writes can go on in parallel.
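To make the exchange above concrete, here is a minimal toy sketch of snapshot-based visibility in Python. It is not PostgreSQL's implementation: the `ToyTable` class, its methods and the commit bookkeeping are all invented for illustration, and only the field names `xmin` and `xmax` are borrowed from PostgreSQL's system columns. It ignores write-write conflicts entirely, so it says nothing about how two concurrent updates of the same row are handled.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass
class RowVersion:
    value: str
    xmin: int                   # id of the transaction that created this version
    xmax: Optional[int] = None  # id of the transaction that superseded it, if any


class ToyTable:
    """Hypothetical append-only store of row versions (illustration only)."""

    def __init__(self) -> None:
        self.versions: List[RowVersion] = []
        self.committed: Set[int] = set()
        self.next_xid = 1

    def begin(self) -> int:
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def commit(self, xid: int) -> None:
        self.committed.add(xid)

    def snapshot(self) -> FrozenSet[int]:
        # A snapshot is the set of transactions committed at this moment.
        return frozenset(self.committed)

    def insert(self, xid: int, value: str) -> None:
        self.versions.append(RowVersion(value, xmin=xid))

    def update(self, xid: int, old: str, new: str) -> None:
        # The old version is kept and marked as superseded; a new version is
        # appended. Nothing a reader is looking at gets overwritten.
        for version in self.versions:
            if version.value == old and version.xmax is None:
                version.xmax = xid
                self.versions.append(RowVersion(new, xmin=xid))
                return
        raise KeyError(old)

    def visible(self, snap: FrozenSet[int]) -> List[str]:
        # Every stored version has to be checked against the snapshot.
        return [
            v.value
            for v in self.versions
            if v.xmin in snap and (v.xmax is None or v.xmax not in snap)
        ]


table = ToyTable()
t1 = table.begin()
table.insert(t1, "a")
table.insert(t1, "b")
table.commit(t1)

reader_snapshot = table.snapshot()       # a reader starts here
t2 = table.begin()
table.update(t2, "a", "a2")              # a writer changes a row, uncommitted
before_commit = table.visible(reader_snapshot)
table.commit(t2)
old_reader = table.visible(reader_snapshot)
new_reader = table.visible(table.snapshot())

print(before_commit, old_reader, new_reader)
```

In this toy model the reader that started before the update keeps seeing the old row, and a reader that starts after the commit sees the new one. Counting rows means testing each stored version against the snapshot, which is the kind of cost the count(*) remark earlier on this page refers to.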
Flawed view of the relational model
I changed a few grand (and incorrect) statements wrt the relational model. For example,
Primary among these was the relational model's
inability to understand "types", combination of
simpler data that make up a single unit. Today we
typically refer to these as objects.
Was totally incorrect because:
Merlin
VACUUM
Vacuuming doesn't actually remove the old data; it only marks it so that the space can be reused by new data. Hence the size of the database files on disk does not change. To actually shrink those files you need to do VACUUM FULL, which will lock the whole database. Also, VACUUM FULL is not usually of any advantage, since an active database would grow again. cbraga 02:28, Jun 11, 2004 (UTC)
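The space-reuse behaviour described above can be pictured with a small toy model in Python. This is only a sketch under loud assumptions: `ToyHeap`, its slot list and its `vacuum` and `vacuum_full` methods are hypothetical stand-ins, not PostgreSQL's storage format or free-space tracking, and the exclusive lock that VACUUM FULL takes on each table it processes is not modelled at all.

```python
class ToyHeap:
    """Hypothetical table file: a list of slots holding a value, DEAD or FREE."""

    DEAD = object()   # row deleted, space not yet reclaimable
    FREE = object()   # space marked reusable by a plain vacuum

    def __init__(self):
        self.slots = []

    def insert(self, value):
        # Reuse a free slot if one exists; otherwise the file grows.
        for i, slot in enumerate(self.slots):
            if slot is ToyHeap.FREE:
                self.slots[i] = value
                return
        self.slots.append(value)

    def delete(self, value):
        self.slots[self.slots.index(value)] = ToyHeap.DEAD

    def vacuum(self):
        # Plain vacuum: dead slots become reusable, the file keeps its length.
        self.slots = [ToyHeap.FREE if s is ToyHeap.DEAD else s for s in self.slots]

    def vacuum_full(self):
        # Full vacuum: compact the file so that it actually shrinks.
        self.slots = [
            s for s in self.slots if s is not ToyHeap.DEAD and s is not ToyHeap.FREE
        ]

    def file_size(self):
        return len(self.slots)


heap = ToyHeap()
for n in range(4):
    heap.insert(f"row{n}")
heap.delete("row1")
heap.delete("row2")
size_after_delete = heap.file_size()

heap.vacuum()
size_after_vacuum = heap.file_size()

heap.insert("row4")                 # lands in a slot freed by vacuum
size_after_reuse = heap.file_size()

heap.vacuum_full()
size_after_full = heap.file_size()

print(size_after_delete, size_after_vacuum, size_after_reuse, size_after_full)
```

In the toy, the file length stays the same through the deletes, the plain vacuum and the insert that reuses a freed slot; only the compacting step shortens it. An active table would soon fill the remaining free slots anyway, which matches the point that a full vacuum rarely pays off.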
Type inheritance?
Could someone please confirm that this is actually correct? It seems to state outright that PostgreSQL has type inheritance, and I'm not sure that it does. Currently the 5th paragraph in the Description section is:
- PostgreSQL also allows types to include inheritance, one of the major concepts in object-oriented programming. For instance, one could define a post_code type, and then create us_zip_code and canadian_postal_code based on them. Addresses could then be specialized for us_address and canadian_address, including specialized rules to validate the data in each case.
It's possible that it's referring to table inheritance already described elsewhere, which I don't personally consider quite the same as type inheritance. To the best of my admittedly restricted knowledge, PostgreSQL doesn't have type inheritance. At least, I can't seem to find anything definitive about it on the CREATE TYPE documentation page (http://www.postgresql.org/docs/7.4/static/sql-createtype.html). It only talks about composite types, base types and array types. There's no mention of inheritance in any of them. Is it an undocumented feature?
Izogi 01:56, 20 Jul 2004 (UTC)
This doesn't have anything to do with PostgreSQL
Illustra's product was first introduced in 1991, where it was used in the Sequoia 2000 project late that year. By 1995 the product had added an ability to write plug-in modules they referred to as DataBlades. Unlike other plug-in technologies, with DataBlades external authors could write code to create new low-level datatypes, and tell the database how to store, index and manage them. For instance, one of the most popular DataBlades was used to create a time-series, a list of one particular variable over time, often with gaps. For instance, the price of a stock changes over time, but there are times, like weekends, where the data does not change and there is no entry. Traditional databases have difficulty handling this sort of task; while they can find a record for a particular date, finding the one that is "active" in one of the gaps is time consuming. With the Time Series DataBlade, this was fast and easy.
DataBlades were incredibly successful and started to generate considerable industry "buzz", eventually leading Informix to purchase the company outright in 1996. Industry insiders claimed that it would not be possible to merge the two products, but in fact this was fairly easy because both were based on the original Ingres code and concepts. Informix released their Illustra-based Universal Server in 1997, leaving them in an unchallenged position in terms of technical merit. Roadrunner 22:53, 29 Jul 2004 (UTC)
Added typical PostgreSQL vs. MySQL flame. I tried to summarize the typical PostgreSQL vs. MySQL battle (and to point out that there is a flame war). Ultimately, every criticism I've ever seen on the issue boiled down to one side saying that PostgreSQL was bloated and the other side saying that MySQL was a toy.
Roadrunner 23:10, 29 Jul 2004 (UTC)
- You took a section about specific PgSQL shortcomings and turned it into a "typical PostgreSQL vs. MySQL flame", which is clearly NOT what it should be. Your edit appears to be a regression. Gene s 04:06, 30 Jul 2004 (UTC)
"direct understanding of relationships"
I've removed the following text from the page, on the grounds that it is vague and nonsensical. AFAIK, PostgreSQL does not actually have a "direct understanding of the relationships that exist between tables." I suspect that some earlier version of POSTGRES or a related product may have had the capability to do this, but AFAIK PostgreSQL does not. The text is vague about the definition of the tables involved and how exactly this "direct relationship" is established, so perhaps I'm missing something. If someone would care to (a) provide links to the PostgreSQL documentation that discusses this feature (b) provide working SQL that actually takes advantage of this alleged functionality, I'd be happy to re-add the text and rework it to be less vague. Neilc 07:13, 28 Sep 2004 (UTC)
This paragraph is just awful: "The SQL data stores simple data types in "flat tables", requiring the user to gather together related information using queries." etc. A relation is a subset of a cartesian product. Sounds like this was written by an uninformed amateur. user:gtoomey 220.240.152.221 15:13, 19 Jan 2005 (UTC)
Another very useful feature of PostgreSQL involves direct understanding of the relationships that exist between tables. People in the real world typically have several addresses, which the relational model approaches by storing the addresses in one table and the rest of the user information in another. The addresses become "related" to a particular user by storing some unique information, say the user's name, in the address table itself. In order to find all the addresses for "Bob Smith", the user writes a query that "joins" the data back together, by selecting a particular name from the users table and then searching for that name in the address table. Doing a search for all the users in New York can become somewhat complex, requiring the database to find all the user names in the address table, then search the user table for those users. A typical search might look like this:
SELECT u.* FROM user u, address a WHERE a.city='New York' AND a.user_name=u.user_name
PostgreSQL can explicitly define the relationship between users and addresses. Once defined, the address becomes a property of the user, so the search can be greatly simplified to:
SELECT * FROM user WHERE address.city='New York'
This code requires no "join": the database itself understands the user.address relationship.
A related example shows the usefulness of types. If one uses PostgreSQL to do:
SELECT address FROM user
then the database filters the results automatically, returning only those addresses for users, not those for companies or other objects that might also use the address table.
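For anyone wanting to test the two query shapes quoted above, here is a minimal, hedged sketch. It runs against SQLite through Python's `sqlite3` module, not against PostgreSQL, so it shows only what standard SQL does with these statements. The table is named `users` rather than `user` because `user` is a reserved word in PostgreSQL, and the columns and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users   (user_name TEXT PRIMARY KEY, email TEXT);
    CREATE TABLE address (user_name TEXT REFERENCES users(user_name), city TEXT);
    INSERT INTO users   VALUES ('Bob Smith', 'bob@example.com');
    INSERT INTO users   VALUES ('Ann Lee', 'ann@example.com');
    INSERT INTO address VALUES ('Bob Smith', 'New York');
    INSERT INTO address VALUES ('Bob Smith', 'Boston');
    INSERT INTO address VALUES ('Ann Lee', 'Chicago');
    """
)

# The conventional form: the relationship is spelled out as a join condition.
rows = conn.execute(
    """
    SELECT u.user_name, u.email
    FROM users AS u
    JOIN address AS a ON a.user_name = u.user_name
    WHERE a.city = ?
    """,
    ("New York",),
).fetchall()
print(rows)

# The "no join" form from the removed text, with the table renamed as above.
try:
    conn.execute("SELECT * FROM users WHERE address.city = 'New York'")
    shortcut_error = None
except sqlite3.OperationalError as exc:
    shortcut_error = str(exc)
print(shortcut_error)
```

The explicit join returns the one matching user, while SQLite rejects the shortcut form because `address` is not named in the FROM clause. What a given PostgreSQL release does with the second statement is not tested here and would need checking against that release's documentation.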
