Q: MySQL: Can I do a left join and pull only one row from the joined table?

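I wrote a custom help desk for work and it ran great... until recently. One query has slowed down badly: it now takes about 14 seconds! Here are the relevant tables: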

CREATE TABLE `tickets` (
  `id` int(11) unsigned NOT NULL DEFAULT '0',
  `date_submitted` datetime DEFAULT NULL,
  `date_closed` datetime DEFAULT NULL,
  `first_name` varchar(50) DEFAULT NULL,
  `last_name` varchar(50) DEFAULT NULL,
  `email` varchar(50) DEFAULT NULL,
  `description` text,
  `agent_id` smallint(5) unsigned NOT NULL DEFAULT '1',
  `status` smallint(5) unsigned NOT NULL DEFAULT '1',
  `priority` tinyint(4) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `date_closed` (`date_closed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `solutions` (
  `id` int(10) unsigned NOT NULL,
  `ticket_id` mediumint(8) unsigned DEFAULT NULL,
  `date` datetime DEFAULT NULL,
  `hours_spent` float DEFAULT NULL,
  `agent_id` smallint(5) unsigned DEFAULT NULL,
  `body` text,
  PRIMARY KEY (`id`),
  KEY `ticket_id` (`ticket_id`),
  KEY `date` (`date`),
  KEY `hours_spent` (`hours_spent`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

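When a user submits a ticket, it goes into the "tickets" table. Then, as the agents work on the problem, they record the actions they take. Each entry goes into the "solutions" table. In other words, a ticket has many solutions.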

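The query that has slowed down is meant to pull all the fields from the "tickets" table, plus the latest entry from the "solutions" table. This is the query I've been using: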

SELECT tickets.*,
    (SELECT CONCAT_WS(" * ", DATE_FORMAT(solutions.date, "%c/%e/%y"), solutions.hours_spent, CONCAT_WS(": ", solutions.agent_id, solutions.body))
    FROM solutions
    WHERE solutions.ticket_id = tickets.id
    ORDER BY solutions.date DESC, solutions.id DESC
    LIMIT 1
) AS latest_solution_entry
FROM tickets
WHERE tickets.date_closed IS NULL
OR tickets.date_closed >= '2012-06-20 00:00:00'
ORDER BY tickets.id DESC

Here's an example of what the "latest_solution_entry" field looks like:

6/20/12 * 1337 * 1: I restarted the computer and that fixed the problem. Yes, I took an hour to do this.

In PHP, I split the "latest_solution_entry" field apart and format it correctly.

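When I noticed that the page running the query had slowed down, I ran the query without the subquery and it was very fast. I then ran EXPLAIN on the original query and got this: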

+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
| id | select_type        | table     | type  | possible_keys | key       | key_len | ref                 | rows  | Extra                       |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
|  1 | PRIMARY            | tickets   | index | date_closed   | PRIMARY   | 4       | NULL                | 35804 | Using where                 |
|  2 | DEPENDENT SUBQUERY | solutions | ref   | ticket_id     | ticket_id | 4       | helpdesk.tickets.id |     1 | Using where; Using filesort |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+

So I'm looking for a way to make my query more efficient while still achieving the same goal. Any ideas?

2012-06-21 03:47:52  Nick

A: Answers

  • 1

    Let me sum up what I understood: you'd like to select each ticket together with its last solution.

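    I like to use the following pattern for this kind of problem because it avoids the subquery pattern, so it works well where performance matters. The drawback is that it is a bit tricky to understand: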

    SELECT
      t.*,
      s1.*
    FROM tickets t
    INNER JOIN solutions s1 ON t.id = s1.ticket_id
    LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id AND s2.id > s1.id
    WHERE s2.id IS NULL;
    

    To make it easier to follow, I only wrote the core of the pattern.

    The key points are:

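    • The LEFT JOIN of the solutions table with itself on the condition s1.ticket_id = s2.ticket_id: it emulates GROUP BY ticket_id.

    • The condition s2.id > s1.id: it is the SQL for "I only want the last solution", and it emulates MAX(). I assumed that in your model "the last" means "with the greatest id", but you could use a condition on the date here instead. Note that s2.id < s1.id would give you the first solution.

    • The WHERE clause s2.id IS NULL: the weirdest part, but absolutely necessary. A row s1 has no matching s2 only when no later solution exists for the same ticket, so this keeps only the records you want.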
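
The three points above can be checked on a tiny data set. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it is self-contained, and the tables are cut down to a few columns with invented sample rows (both are assumptions, not part of the original schema or data).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE solutions (id INTEGER PRIMARY KEY, ticket_id INTEGER, body TEXT);
    -- Hypothetical sample data: ticket 1 has three entries, ticket 2 has one.
    INSERT INTO tickets VALUES (1, 'Printer jam'), (2, 'No network');
    INSERT INTO solutions VALUES
        (10, 1, 'Opened tray'),
        (11, 1, 'Removed paper'),
        (12, 2, 'Replaced cable'),
        (13, 1, 'Ran test page');
""")

# s2 searches for a newer solution on the same ticket. When none exists, the
# LEFT JOIN fills s2 with NULLs, and WHERE s2.id IS NULL keeps exactly those rows.
latest_by_id = conn.execute("""
    SELECT t.id, s1.id, s1.body
    FROM tickets t
    INNER JOIN solutions s1 ON t.id = s1.ticket_id
    LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id AND s2.id > s1.id
    WHERE s2.id IS NULL
    ORDER BY t.id
""").fetchall()

print(latest_by_id)
# -> [(1, 13, 'Ran test page'), (2, 12, 'Replaced cable')]
```

Each ticket comes back exactly once, paired with the solution that has the greatest id.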

    Give it a try and let me know :)

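    Edit 1: I just realized that the assumption in the second point oversimplifies the problem. That makes it even more interesting :p I'm trying to work out how this pattern behaves with your date, id ordering.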

    Edit 2: OK, it works with a small twist. The condition on the LEFT JOIN becomes:

    LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id
      AND (s2.date > s1.date OR (s2.date = s1.date AND s2.id > s1.id))
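
To see what the extra tie-break changes, the sketch below runs the same anti-join with three different "is newer" conditions. It again assumes SQLite (via Python's sqlite3) in place of MySQL, and the rows are invented: two entries share a timestamp, and the entry with the highest id carries an older date. The helper name latest_ids is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE solutions (id INTEGER PRIMARY KEY, ticket_id INTEGER, date TEXT);
    -- Hypothetical rows: 10 and 11 share a date; 12 has the highest id but is older.
    INSERT INTO solutions VALUES
        (10, 1, '2012-06-20 09:00:00'),
        (11, 1, '2012-06-20 09:00:00'),
        (12, 1, '2012-06-19 17:30:00');
""")

def latest_ids(newer_condition):
    """Return the solution ids that survive the anti-join for a given condition."""
    sql = f"""
        SELECT s1.id
        FROM solutions s1
        LEFT JOIN solutions s2
          ON s1.ticket_id = s2.ticket_id AND ({newer_condition})
        WHERE s2.id IS NULL
        ORDER BY s1.id
    """
    return [row[0] for row in conn.execute(sql)]

by_id = latest_ids("s2.id > s1.id")
by_date_only = latest_ids("s2.date > s1.date")
by_date_then_id = latest_ids(
    "s2.date > s1.date OR (s2.date = s1.date AND s2.id > s1.id)"
)

print(by_id)            # -> [12]      highest id, but not the latest date
print(by_date_only)     # -> [10, 11]  ties on the date produce two rows
print(by_date_then_id)  # -> [11]      latest date, ties broken by id
```

Only the combined condition reproduces ORDER BY date DESC, id DESC LIMIT 1 from the question on this data.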
    
    2012-06-21 08:32:18  Olivier Coilland
  • 2

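    When you have an inline view in the SELECT clause, it must execute that select for every row. In cases like this I find it better to put the inline view in the FROM clause instead, which executes the select once.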

    SELECT t.*, 
           Concat_ws(" * ", Date_format(s.date, "%c/%e/%y"), s.hours_spent, 
           Concat_ws(": ", s.agent_id, s.body)) AS latest_solution_entry
    FROM   tickets t 
           INNER JOIN (SELECT solutions.ticket_id,
                              Max(solutions.date) maxdate 
                       FROM   solutions 
                       GROUP  BY solutions.ticket_id) last_solutions 
                   ON t.id = last_solutions.ticket_id
           INNER JOIN (SELECT solutions.ticket_id,
                              solutions.date,
                              Max(solutions.id) maxid 
                       FROM   solutions 
                       GROUP  BY solutions.ticket_id,
                                solutions.date) last_solution
                  ON last_solutions.ticket_id = last_solution.ticket_id 
                     and last_solutions.maxDate = last_solution.Date
           INNER JOIN solutions s 
                   ON last_solution.maxid = s.id
    WHERE  t.date_closed IS NULL 
            OR t.date_closed >= '2012-06-20 00:00:00' 
    ORDER  BY t.id DESC 
    

    Note: depending on your needs, you may have to turn these into LEFT JOINs.
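
The difference shows up for tickets that have no solution yet. Here is a rough sketch of the same three-join shape, run once with INNER and once with LEFT joins. It assumes SQLite through Python's sqlite3 module instead of MySQL, trims the tables to a few columns, drops the date_closed filter for brevity, and uses invented rows; the helper name run is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY);
    CREATE TABLE solutions (id INTEGER PRIMARY KEY, ticket_id INTEGER,
                            date TEXT, body TEXT);
    -- Hypothetical data: ticket 1 has two entries, ticket 2 has none yet.
    INSERT INTO tickets VALUES (1), (2);
    INSERT INTO solutions VALUES
        (10, 1, '2012-06-19 10:00:00', 'Checked cables'),
        (11, 1, '2012-06-20 09:00:00', 'Rebooted');
""")

# Same structure as the query above; {join} is filled with INNER or LEFT.
QUERY = """
    SELECT t.id, s.body
    FROM tickets t
    {join} JOIN (SELECT ticket_id, MAX(date) AS maxdate
                 FROM solutions GROUP BY ticket_id) last_solutions
           ON t.id = last_solutions.ticket_id
    {join} JOIN (SELECT ticket_id, date, MAX(id) AS maxid
                 FROM solutions GROUP BY ticket_id, date) last_solution
           ON last_solutions.ticket_id = last_solution.ticket_id
          AND last_solutions.maxdate = last_solution.date
    {join} JOIN solutions s ON last_solution.maxid = s.id
    ORDER BY t.id DESC
"""

def run(join):
    """Execute the query with the given join type and return all rows."""
    return conn.execute(QUERY.format(join=join)).fetchall()

inner_rows = run("INNER")
left_rows = run("LEFT")

print(inner_rows)  # -> [(1, 'Rebooted')]             ticket 2 is dropped
print(left_rows)   # -> [(2, None), (1, 'Rebooted')]  ticket 2 kept, entry is NULL
```

With INNER joins, a freshly submitted ticket disappears from the list until an agent logs a first entry; with LEFT joins it stays, and the solution columns are NULL.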

    2012-06-21 04:52:02  Conrad Frix
  • 3

    Try this:

    SELECT *
    FROM (
      -- for each ticket get the most recent solution date
      SELECT ticket_id, MAX(solutions.date) as date
      FROM solutions
      GROUP BY ticket_id
    ) t
    JOIN tickets ON t.ticket_id = tickets.id
    WHERE tickets.date_closed IS NULL OR tickets.date_closed >= '2012-06-20 00:00:00'
    ORDER BY tickets.id DESC
    

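    Note that if a ticket has two solutions with the same date, you will get duplicate records in your result set once you join back to solutions on that date to fetch the entry itself. You will need another join to remove those duplicates, or use an absolute sequence instead, such as a serial (an incrementing primary key).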
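
The duplicate problem, and the "absolute sequence" way around it, can be reproduced on a few rows. This is just a sketch under assumptions: SQLite via Python's sqlite3 stands in for MySQL, the solutions table is reduced to four columns, and the rows are invented so that ticket 1 has two entries with the same timestamp.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE solutions (id INTEGER PRIMARY KEY, ticket_id INTEGER,
                            date TEXT, body TEXT);
    -- Hypothetical data: ticket 1 has two entries logged at the same time.
    INSERT INTO solutions VALUES
        (10, 1, '2012-06-20 09:00:00', 'First note'),
        (11, 1, '2012-06-20 09:00:00', 'Second note'),
        (12, 2, '2012-06-20 11:00:00', 'Only note');
""")

# Joining the per-ticket MAX(date) back to solutions matches every row that
# shares that date, so ticket 1 appears twice.
by_max_date = conn.execute("""
    SELECT t.ticket_id, s.id
    FROM (SELECT ticket_id, MAX(date) AS date
          FROM solutions GROUP BY ticket_id) t
    JOIN solutions s ON s.ticket_id = t.ticket_id AND s.date = t.date
    ORDER BY t.ticket_id, s.id
""").fetchall()

# Using the incrementing primary key as the sequence gives one row per ticket.
by_max_id = conn.execute("""
    SELECT ticket_id, MAX(id)
    FROM solutions
    GROUP BY ticket_id
    ORDER BY ticket_id
""").fetchall()

print(by_max_date)  # -> [(1, 10), (1, 11), (2, 12)]
print(by_max_id)    # -> [(1, 11), (2, 12)]
```

Relying on MAX(id) is only equivalent to "latest by date" if entries are always inserted in chronological order, which is an assumption about how the help desk is used.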

    2012-06-21 06:20:24  Elliot Chance
  • 4

    Depending on the purpose, here is one idea:

    SELECT DISTINCT s1.ticket_id, t.*,  s1.*
    FROM tickets t
    LEFT JOIN solutions s1 ON t.id = s1.ticket_id
    
    2013-10-30 09:41:43  hvtruong