linux postmaster high cpu
1.
linux postmaster high cpu
ganderson_hyper
Posted Jun 05, 2007 10:04 PM
I have a postmaster proc on a linux server that starts using 100% of the cpu within a few hours after a reboot.
Is this normal?
I am running 2 GB of memory with 2 HT Intel CPUs, but the charts seem to take 1-2 minutes to generate.
--Thanks Garry
2.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 05, 2007 10:17 PM
I am running the internal db and HQ v3.0.4 (build #389 - Apr 27, 2007 - Relase Build)
--Thanks
3.
RE: linux postmaster high cpu
jtravis_hyperic
Posted Jun 06, 2007 05:28 PM
Hey Garry,
Does the high load ever subside? It's possible that HQ is performing
a vacuuming operation, which usually porks the DB pretty hard.
Eventually, though, it should go back down.
-- Jon
On Jun 5, 2007, at 3:04 PM, Garry Anderson wrote:
> I have a postmaster proc on a linux server that starts using 100%
> of the cpu within a few hours after a reboot.
>
> Is this normal?
>
> I am running 2gb memory with 2HT intel CPU's, but the charts seem
> to take 1 - 2 minutes to generate.
>
> --Thanks Garry
>
4.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 06, 2007 05:59 PM
Jon,
No, it has been running hard for days. Load avg never below 1.25.
The offending postmaster process has used 32668:25 (about 22 days) of CPU time in 22 days of system uptime.
--thanks Garry
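As a quick sanity check on the CPU figure above, here is a minimal Python sketch. It assumes the `32668:25` value is in the minutes:seconds form that `top` uses for its TIME column; the post does not say which tool reported it, and the function name is made up for illustration.

```python
def cpu_time_to_days(cpu_time: str) -> float:
    """Convert a 'minutes:seconds' CPU time string to days (assumed format)."""
    minutes, seconds = cpu_time.split(":")
    total_seconds = int(minutes) * 60 + int(seconds)
    return total_seconds / 86400  # 86400 seconds per day


days = cpu_time_to_days("32668:25")
print(f"{days:.1f} days of CPU time")
```

Under that assumption the value works out to a little under 23 days, which matches the "22 days" in the post and would mean roughly one logical CPU has been kept busy for the whole uptime.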
5.
RE: linux postmaster high cpu
jtravis_hyperic
Posted Jun 07, 2007 12:07 AM
Weird. Normal operation shouldn't give you a heavy load unless
you're monitoring a huge amount of things. How many platforms are
you monitoring? I assume that you're running the DB and HQ on the
same box?
-- Jon
On Jun 6, 2007, at 10:58 AM, Garry Anderson wrote:
> Jon,
> No, it has been running hard for days. Load avg never below 1.25.
>
> I have, the offending, postmaster process that has used 32668:25
> (22 days) of cpu in 22 days of uptime for the system.
>
> --thanks Garry
>
6.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 11, 2007 04:06 PM
Jon,
Thanks. Yes, HQ and the DB are on the same box. I have about 21 servers with 803 services.
--Thanks Garry
7.
RE: linux postmaster high cpu
admin
Posted Jun 11, 2007 08:26 PM
Maybe there is a query that is spinning? You can check the currently running queries by running:
bash> server-3.0.x/bin/db-psql.sh
hqdb> select * from pg_stat_activity where current_query not like '%IDLE%';
Attach the output from that command here. You can also check the hqdb.log file for anything suspicious. What effect does this have on your monitored systems in HQ? Does everything appear normal from the UI?
-Ryan
8.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 11, 2007 09:31 PM
Ryan,
Thanks, I have attached the output from the sql.
The only thing in the logs is:
[2007-06-11 15:20:52.410 MDT] ERROR: duplicate key violates unique constraint "eam_measurement_data_pkey"
[2007-06-11 15:20:52.419 MDT] ERROR: duplicate key violates unique constraint "eam_measurement_data_pkey"
But I think this is normal; I seem to recall coming across this in another forum.
Thanks!
--Garry
9.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 11, 2007 09:33 PM
Sorry, I forgot to attach the file to the last reply.
10.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 11, 2007 10:16 PM
Also, the UI is very slow; it takes 2-4 minutes to bring up basic 24-hour graphs. The monitored systems don't seem to be affected, and they seem to be getting all their data.
--Thanks Garry
11.
RE: linux postmaster high cpu
admin
Posted Jun 11, 2007 10:24 PM
Thanks for the additional info. I need you to do one more thing though to make it useful. Can you edit: hqdb/data/postgresql.conf
And uncomment the line that says:
# stats_command_string = on
Then restart the HQ server? This will enable the statement stats so we can see them from the above command. Once the server starts spinning again, rerun the SQL above and attach it to this thread.
-Ryan
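The edit described above can also be scripted. This is only a sketch: the helper name `uncomment_setting` and the sample text are made up for illustration, and it works on a string rather than on the real hqdb/data/postgresql.conf, so nothing is changed on disk.

```python
import re


def uncomment_setting(conf_text: str, name: str) -> str:
    """Strip the leading '#' from the first commented-out line that sets `name`."""
    pattern = re.compile(
        rf"^[ \t]*#[ \t]*({re.escape(name)}[ \t]*=.*)$", re.MULTILINE
    )
    return pattern.sub(r"\1", conf_text, count=1)


# Sample text standing in for the real config file (illustrative only).
sample = "# stats_command_string = on\nmax_connections = 100\n"
print(uncomment_setting(sample, "stats_command_string"))
```

Lines that are already active are left alone, so running it twice gives the same result as running it once.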
12.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 12, 2007 02:35 PM
Thanks, here is the query. It has been running since a few minutes after I restarted the HQ server and has used 14 hours of CPU since then, which is about how long the server has been running.
--Thanks Garry
hqdb=# select * from pg_stat_activity where current_query not like '%IDLE%';
datid | datname | procpid | usesysid | usename | current_query | query_start | backend_start | client_addr | client_port
-------+---------+---------+----------+---------+------------------------------------------------------------------------------------------------+-------------------------------+-------------------------------+------------------+-------------
16384 | hqdb | 4071 | 16385 | hqadmin | BEGIN;DELETE FROM EAM_MEASUREMENT_DATA WHERE timestamp BETWEEN 1178496000000 AND 1178499600000 | 2007-06-11 17:36:44.903943-06 | 2007-06-11 16:32:11.340559-06 | ::ffff:127.0.0.1 | 50046
(1 row)
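The two bounds in the DELETE above look like Unix timestamps in milliseconds. That is an assumption based on their magnitude, not something stated in the thread. A short Python sketch to decode them under that assumption:

```python
from datetime import datetime, timezone


def ms_to_utc(ms: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch (assumed)."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


start_ms, end_ms = 1178496000000, 1178499600000
start, end = ms_to_utc(start_ms), ms_to_utc(end_ms)

print(start.isoformat(), "->", end.isoformat())
print("window length (minutes):", (end_ms - start_ms) / 60000)
```

Read this way, the statement covers exactly one hour, 2007-05-07 00:00 to 01:00 UTC, which is more than a month before the query was started.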
13.
RE: linux postmaster high cpu
admin
Posted Jun 12, 2007 03:37 PM
That query is what is run when the metric compression routine runs each hour. It's trying to delete 1 hour's worth of data from your detailed measurement table. Has your HQ installation ever run without the high CPU?
I'm wondering if somehow that table has become corrupted. The delete should not be taking so long. What type of storage is backing the database?
If you just want to get things up and functional again, it's pretty easy to just truncate that table:
hqdb=# truncate table eam_measurement_data;
hqdb=# vacuum analyze eam_measurement_data;
hqdb=# reindex table eam_measurement_data;
-Ryan
14.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 12, 2007 06:26 PM
Hmm, I haven't had any issues, but I tried the following:
hqdb=# select count(*) from eam_measurement_data;
ERROR: could not access status of transaction 576717946
DETAIL: could not open file "pg_clog/0226": No such file or directory
Sure enough, that file is not to be found.
The disk is RAID 5 on a Dell PERC 5i controller. I haven't seen any errors in the logs.
Should I still run the commands to truncate that table?
I wonder if there are any other tables that are corrupt, assuming that the above error is "corruption".
I really haven't done anything with this server; it has not crashed and the RAID is nominal. I have upgraded the HQ server 4 times and that has all gone OK.
There is plenty of disk space.
--Thanks Garry
15.
RE: linux postmaster high cpu
Best Answer
admin
Posted Jun 12, 2007 06:52 PM
> hqdb=# select count(*) from eam_measurement_data;
> ERROR: could not access status of transaction
> 576717946
> DETAIL: could not open file "pg_clog/0226": No such
> file or directory
>
That looks like the problem. I wonder if maybe it's just the index that's out of sync? Before we go down the path of truncating, let's try to reindex that table to see if that fixes the problem:
hqdb=# reindex table eam_measurement_data;
hqdb=# select count(*) from eam_measurement_data;
If you continue to have problems, we'll have to do some surgery on your DB. (Assuming you don't have a recent backup).
I just ran across this thread on the PG user forums with a similar issue:
http://archives.postgresql.org/pgsql-general/2006-07/msg01061.php
I'll do some more digging.
Thanks Garry,
-Ryan
16.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 12, 2007 07:23 PM
Thanks.
Funny thing is that the files in pg_clog look sequential with the highest number being: 0050 (hex sequence) and this file is 0226. Why such a big gap?
I started the reindex.... it has been running over an hour... wonder if it will finish!
--Thanks Garry
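On the question of the gap: the file name in the error follows directly from the transaction ID. The Python sketch below assumes a default PostgreSQL build (8 kB pages, 2 status bits per transaction, 32 pages per pg_clog segment file); the function names are made up for illustration.

```python
BLOCK_SIZE = 8192        # default PostgreSQL page size in bytes (assumed)
XACTS_PER_BYTE = 4       # 2 status bits per transaction
PAGES_PER_SEGMENT = 32   # pages in one pg_clog segment file
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT  # 1,048,576


def clog_segment_name(xid: int) -> str:
    """Name of the pg_clog file that holds the status of transaction `xid`."""
    return f"{xid // XACTS_PER_SEGMENT:04X}"


def max_xid_in_segment(name: str) -> int:
    """Highest transaction ID whose status lives in the named segment file."""
    return (int(name, 16) + 1) * XACTS_PER_SEGMENT - 1


print(clog_segment_name(576717946))   # segment for the xid in the error: 0226
print(max_xid_in_segment("0050"))     # last xid covered by file 0050: 84934655
```

If those assumptions hold, transaction 576717946 is far beyond anything the existing files cover. One possible reading is that the transaction ID stored in a row is itself bad, rather than that hundreds of intermediate files once existed; the thread does not confirm this either way.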
17.
RE: linux postmaster high cpu
ganderson_hyper
Posted Jun 12, 2007 10:45 PM
Ryan,
Thank you for all your help. I followed the link that you gave, and it said to create empty files for the missing pg_clog/* files.
I had 4 missing. It seems to be running a lot better, but I still wonder if I should just rebuild the DB.
The system seems much better, and the vacuuming seems to go much better.
I deleted some 25 million rows and still have 8 million; the GUI is faster.
I have one row with timestamp < 0 ... should I delete it?
I am missing data for all the servers, but don't really care.
I was talking to a co-worker, and he reminded me that the prior owner of this project "accidentally" deleted some data... that must have been what happened.
--Thanks everyone for all your help
--Garry
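The post does not show the commands used to create the files. A common form of this workaround is to write a zero-filled file the size of one segment, which is 256 KiB with the default 8 kB page size. The Python sketch below makes that assumption, uses a hypothetical helper name, and writes only to a scratch directory, not to a real data directory.

```python
import os
import tempfile

SEGMENT_BYTES = 32 * 8192  # 32 pages of 8 kB each = 256 KiB (default build assumed)


def create_zeroed_segment(clog_dir: str, name: str) -> str:
    """Create a zero-filled segment file; mode 'xb' refuses to overwrite."""
    path = os.path.join(clog_dir, name)
    with open(path, "xb") as f:
        f.write(b"\x00" * SEGMENT_BYTES)
    return path


# Demonstration against a temporary directory only.
with tempfile.TemporaryDirectory() as scratch:
    created = create_zeroed_segment(scratch, "0226")
    print(os.path.basename(created), os.path.getsize(created))
```

All-zero status bits mean that no transaction in that range is recorded as committed, so rows written by those transactions would not become visible again. That would fit with the missing data mentioned in the post.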
18.
RE: linux postmaster high cpu
admin
Posted Jun 12, 2007 11:35 PM
Great! Glad you were able to get to the bottom of the problem.
I would continue to monitor it to make sure everything is working properly. Rebuilding the DB can take some time, so I would only resort to that if it's really necessary.
Curious how that row with a negative timestamp got in there, but it should be safe to remove it. :)
-Ryan