Sto usando Postgres 9.3.5 e ho una grande tabella nel database, attualmente ha oltre 25 milioni di righe e tende a diventare ancora più grande rapidamente. Sto cercando di selezionare righe specifiche (tutte unit_id
con solo le ultime unit_timestamp
per ognuna di esse) con una semplice query come:
SELECT unit_id, max(unit_timestamp) AS latest_timestamp FROM all_units GROUP BY unit_id;
Senza alcun indice, l'esecuzione di questa query richiede circa 35 secondi. Con un indice definito (CREATE INDEX partial_idx ON all_units (unit_id, unit_timestamp DESC);
) il tempo di interrogazione è ridotto a circa (solo) 19 secondi.
Mi chiedo se sarà mai possibile eseguire la mia query in ancora meno tempo (come solo pochi secondi) e in tal caso, quali passi devo prendere per ottimizzarla ulteriormente?
Il dump della struttura della tabella è simile al seguente:
CREATE TABLE "all_units" (
"unit_id" int4 NOT NULL,
"unit_timestamp" timestamp(6) NOT NULL,
"lon" float4,
"lat" float4,
"speed" float4,
"status" varchar(255) COLLATE "default"
)
ALTER TABLE "all_units" ADD PRIMARY KEY ("unit_id", "unit_timestamp");
Quanto EXPLAIN (ANALYZE, BUFFERS)
segue:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=663998.38..664069.73 rows=7135 width=12) (actual time=84715.050..84732.021 rows=11094 loops=1)
Buffers: shared hit=192 read=286819
-> Seq Scan on ais_sorted (cost=0.00..538335.92 rows=25132492 width=12) (actual time=0.608..41264.196 rows=25132492 loops=1)
Buffers: shared hit=192 read=286819
Total runtime: 84746.501 ms
e le mie impostazioni psql sul server si presentano così:
name | context | min_val | max_val | boot_val
-------------------------------------+------------+-----------+--------------+-----------------------------------------
allow_system_table_mods | postmaster | | | off
application_name | user | | |
archive_command | sighup | | |
archive_mode | postmaster | | | off
archive_timeout | sighup | 0 | 1073741823 | 0
array_nulls | user | | | on
authentication_timeout | sighup | 1 | 600 | 60
autovacuum | sighup | | | on
autovacuum_analyze_scale_factor | sighup | 0 | 100 | 0.1
autovacuum_analyze_threshold | sighup | 0 | 2147483647 | 50
autovacuum_freeze_max_age | postmaster | 100000000 | 2000000000 | 200000000
autovacuum_max_workers | postmaster | 1 | 8388607 | 3
autovacuum_multixact_freeze_max_age | postmaster | 10000000 | 2000000000 | 400000000
autovacuum_naptime | sighup | 1 | 2147483 | 60
autovacuum_vacuum_cost_delay | sighup | -1 | 100 | 20
autovacuum_vacuum_cost_limit | sighup | -1 | 10000 | -1
autovacuum_vacuum_scale_factor | sighup | 0 | 100 | 0.2
autovacuum_vacuum_threshold | sighup | 0 | 2147483647 | 50
backslash_quote | user | | | safe_encoding
bgwriter_delay | sighup | 10 | 10000 | 200
bgwriter_lru_maxpages | sighup | 0 | 1000 | 100
bgwriter_lru_multiplier | sighup | 0 | 10 | 2
block_size | internal | 8192 | 8192 | 8192
bonjour | postmaster | | | off
bonjour_name | postmaster | | |
bytea_output | user | | | hex
check_function_bodies | user | | | on
checkpoint_completion_target | sighup | 0 | 1 | 0.5
checkpoint_segments | sighup | 1 | 2147483647 | 3
checkpoint_timeout | sighup | 30 | 3600 | 300
checkpoint_warning | sighup | 0 | 2147483647 | 30
client_encoding | user | | | SQL_ASCII
client_min_messages | user | | | notice
commit_delay | superuser | 0 | 100000 | 0
commit_siblings | user | 0 | 1000 | 5
config_file | postmaster | | |
constraint_exclusion | user | | | partition
cpu_index_tuple_cost | user | 0 | 1.79769e+308 | 0.005
cpu_operator_cost | user | 0 | 1.79769e+308 | 0.0025
cpu_tuple_cost | user | 0 | 1.79769e+308 | 0.01
cursor_tuple_fraction | user | 0 | 1 | 0.1
data_checksums | internal | | | off
data_directory | postmaster | | |
DateStyle | user | | | ISO, MDY
db_user_namespace | sighup | | | off
deadlock_timeout | superuser | 1 | 2147483647 | 1000
debug_assertions | user | | | off
debug_pretty_print | user | | | on
debug_print_parse | user | | | off
debug_print_plan | user | | | off
debug_print_rewritten | user | | | off
default_statistics_target | user | 1 | 10000 | 100
default_tablespace | user | | |
default_text_search_config | user | | | pg_catalog.simple
default_transaction_deferrable | user | | | off
default_transaction_isolation | user | | | read committed
default_transaction_read_only | user | | | off
default_with_oids | user | | | off
dynamic_library_path | superuser | | | $libdir
effective_cache_size | user | 1 | 2147483647 | 16384
effective_io_concurrency | user | 0 | 1000 | 1
enable_bitmapscan | user | | | on
enable_hashagg | user | | | on
enable_hashjoin | user | | | on
enable_indexonlyscan | user | | | on
enable_indexscan | user | | | on
enable_material | user | | | on
enable_mergejoin | user | | | on
enable_nestloop | user | | | on
enable_seqscan | user | | | on
enable_sort | user | | | on
enable_tidscan | user | | | on
escape_string_warning | user | | | on
event_source | postmaster | | | PostgreSQL
exit_on_error | user | | | off
external_pid_file | postmaster | | |
extra_float_digits | user | -15 | 3 | 0
from_collapse_limit | user | 1 | 2147483647 | 8
fsync | sighup | | | on
full_page_writes | sighup | | | on
geqo | user | | | on
geqo_effort | user | 1 | 10 | 5
geqo_generations | user | 0 | 2147483647 | 0
geqo_pool_size | user | 0 | 2147483647 | 0
geqo_seed | user | 0 | 1 | 0
geqo_selection_bias | user | 1.5 | 2 | 2
geqo_threshold | user | 2 | 2147483647 | 12
gin_fuzzy_search_limit | user | 0 | 2147483647 | 0
hba_file | postmaster | | |
hot_standby | postmaster | | | off
hot_standby_feedback | sighup | | | off
ident_file | postmaster | | |
ignore_checksum_failure | superuser | | | off
ignore_system_indexes | backend | | | off
integer_datetimes | internal | | | on
IntervalStyle | user | | | postgres
join_collapse_limit | user | 1 | 2147483647 | 8
krb_caseins_users | sighup | | | off
krb_server_keyfile | sighup | | | FILE:/etc/postgresql-common/krb5.keytab
krb_srvname | sighup | | | postgres
lc_collate | internal | | | C
lc_ctype | internal | | | C
lc_messages | superuser | | |
lc_monetary | user | | | C
lc_numeric | user | | | C
lc_time | user | | | C
listen_addresses | postmaster | | | localhost
lo_compat_privileges | superuser | | | off
local_preload_libraries | backend | | |
lock_timeout | user | 0 | 2147483647 | 0
log_autovacuum_min_duration | sighup | -1 | 2147483647 | -1
log_checkpoints | sighup | | | off
log_connections | backend | | | off
log_destination | sighup | | | stderr
log_directory | sighup | | | pg_log
log_disconnections | backend | | | off
log_duration | superuser | | | off
log_error_verbosity | superuser | | | default
log_executor_stats | superuser | | | off
log_file_mode | sighup | 0 | 511 | 384
log_filename | sighup | | | postgresql-%Y-%m-%d_%H%M%S.log
log_hostname | sighup | | | off
log_line_prefix | sighup | | |
log_lock_waits | superuser | | | off
log_min_duration_statement | superuser | -1 | 2147483647 | -1
log_min_error_statement | superuser | | | error
log_min_messages | superuser | | | warning
log_parser_stats | superuser | | | off
log_planner_stats | superuser | | | off
log_rotation_age | sighup | 0 | 35791394 | 1440
log_rotation_size | sighup | 0 | 2097151 | 10240
log_statement | superuser | | | none
log_statement_stats | superuser | | | off
log_temp_files | superuser | -1 | 2147483647 | -1
log_timezone | sighup | | | GMT
log_truncate_on_rotation | sighup | | | off
logging_collector | postmaster | | | off
maintenance_work_mem | user | 1024 | 2147483647 | 16384
max_connections | postmaster | 1 | 8388607 | 100
max_files_per_process | postmaster | 25 | 2147483647 | 1000
max_function_args | internal | 100 | 100 | 100
max_identifier_length | internal | 63 | 63 | 63
max_index_keys | internal | 32 | 32 | 32
max_locks_per_transaction | postmaster | 10 | 2147483647 | 64
max_pred_locks_per_transaction | postmaster | 10 | 2147483647 | 64
max_prepared_transactions | postmaster | 0 | 8388607 | 0
max_stack_depth | superuser | 100 | 2147483647 | 100
max_standby_archive_delay | sighup | -1 | 2147483647 | 30000
max_standby_streaming_delay | sighup | -1 | 2147483647 | 30000
max_wal_senders | postmaster | 0 | 8388607 | 0
password_encryption | user | | | on
port | postmaster | 1 | 65535 | 5432
post_auth_delay | backend | 0 | 2147 | 0
pre_auth_delay | sighup | 0 | 60 | 0
quote_all_identifiers | user | | | off
random_page_cost | user | 0 | 1.79769e+308 | 4
restart_after_crash | sighup | | | on
search_path | user | | | "$user",public
segment_size | internal | 131072 | 131072 | 131072
seq_page_cost | user | 0 | 1.79769e+308 | 1
server_encoding | internal | | | SQL_ASCII
server_version | internal | | | 9.3.5
server_version_num | internal | 90305 | 90305 | 90305
session_replication_role | superuser | | | origin
shared_buffers | postmaster | 16 | 1073741823 | 1024
shared_preload_libraries | postmaster | | |
sql_inheritance | user | | | on
ssl | postmaster | | | off
ssl_ca_file | postmaster | | |
ssl_cert_file | postmaster | | | server.crt
ssl_ciphers | postmaster | | | DEFAULT:!LOW:!EXP:!MD5:@STRENGTH
ssl_crl_file | postmaster | | |
ssl_key_file | postmaster | | | server.key
ssl_renegotiation_limit | user | 0 | 2147483647 | 524288
standard_conforming_strings | user | | | on
statement_timeout | user | 0 | 2147483647 | 0
stats_temp_directory | sighup | | | pg_stat_tmp
superuser_reserved_connections | postmaster | 0 | 8388607 | 3
synchronize_seqscans | user | | | on
synchronous_commit | user | | | on
synchronous_standby_names | sighup | | |
syslog_facility | sighup | | | local0
syslog_ident | sighup | | | postgres
tcp_keepalives_count | user | 0 | 2147483647 | 0
tcp_keepalives_idle | user | 0 | 2147483647 | 0
tcp_keepalives_interval | user | 0 | 2147483647 | 0
temp_buffers | user | 100 | 1073741823 | 1024
temp_file_limit | superuser | -1 | 2147483647 | -1
temp_tablespaces | user | | |
TimeZone | user | | | GMT
timezone_abbreviations | user | | |
trace_notify | user | | | off
trace_recovery_messages | sighup | | | log
trace_sort | user | | | off
track_activities | superuser | | | on
track_activity_query_size | postmaster | 100 | 102400 | 1024
track_counts | superuser | | | on
track_functions | superuser | | | none
track_io_timing | superuser | | | off
transaction_deferrable | user | | | off
transaction_isolation | user | | | default
transaction_read_only | user | | | off
transform_null_equals | user | | | off
unix_socket_directories | postmaster | | | /var/run/postgresql
unix_socket_group | postmaster | | |
unix_socket_permissions | postmaster | 0 | 511 | 511
update_process_title | superuser | | | on
vacuum_cost_delay | user | 0 | 100 | 0
vacuum_cost_limit | user | 1 | 10000 | 200
vacuum_cost_page_dirty | user | 0 | 10000 | 20
vacuum_cost_page_hit | user | 0 | 10000 | 1
vacuum_cost_page_miss | user | 0 | 10000 | 10
vacuum_defer_cleanup_age | sighup | 0 | 1000000 | 0
vacuum_freeze_min_age | user | 0 | 1000000000 | 50000000
vacuum_freeze_table_age | user | 0 | 2000000000 | 150000000
vacuum_multixact_freeze_min_age | user | 0 | 1000000000 | 5000000
vacuum_multixact_freeze_table_age | user | 0 | 2000000000 | 150000000
wal_block_size | internal | 8192 | 8192 | 8192
wal_buffers | postmaster | -1 | 2147483647 | -1
wal_keep_segments | sighup | 0 | 2147483647 | 0
wal_level | postmaster | | | minimal
wal_receiver_status_interval | sighup | 0 | 2147483 | 10
wal_receiver_timeout | sighup | 0 | 2147483647 | 60000
wal_segment_size | internal | 2048 | 2048 | 2048
wal_sender_timeout | sighup | 0 | 2147483647 | 60000
wal_sync_method | sighup | | | fdatasync
wal_writer_delay | sighup | 1 | 10000 | 200
work_mem | user | 64 | 2147483647 | 1024
xmlbinary | user | | | base64
xmloption | user | | | content
zero_damaged_pages | superuser | | | off
SELECT DISTINCT ON (unit_id),unit_timestamp FROM t ORDER BY unit_timestamp DESC
Un indice separato su unit_timestamp sarebbe di aiuto.